Re: [HACKERS] Missing optimization when filters are applied after window functions
On Thu, May 17, 2012 at 7:26 AM, Volker Grabsch wrote:
> Hitoshi Harada schrieb:
>> On Wed, May 16, 2012 at 12:50 AM, Volker Grabsch wrote:
>> > I propose the following general optimization: If all window
>> > functions are partitioned by the same first field (here: id),
>> > then any filter on that field should be executed before
>> > WindowAgg. So a query like this:
>>
>> I think that's possible. Currently the planner doesn't think any
>> qualification from the upper query can be pushed down to a query that
>> has a window function. It would be possible to let it push down if
>> the expression matches the PARTITION BY expression.
>
> Sounds great!
>
>> However, the challenge is that a query may have a number of window
>> functions that have different PARTITION BY expressions. At the time of
>> pushing down in the planning, it is not obvious which window function
>> comes first.
>
> I don't really understand what you mean with "which window function
> comes first", because to my understanding, all window functions of
> a query belong to the same level in the query hierarchy. But then,
> my knowledge of PostgreSQL internals isn't very deep, either.

No, you can specify as many window specifications as you like. Say,

SELECT count(*) over (w1), count(*) over (w2)
FROM tbl
WINDOW w1 AS (PARTITION BY x ORDER BY y),
       w2 AS (PARTITION BY y ORDER BY x);

and in the same query level there are different types of window specifications. The code as it stands today doesn't have any semantics about which of w1 or w2 to run first (probably w1 will be run first, but the query semantics don't enforce anything).

>> One idea is to restrict such optimization to only the case of a single
>> window function, and the other is to generalize it and cover a lot of
>> cases.
>
> From a practical point of view, the restriction to a single window
> function wouldn't be that bad, although I'd prefer to think about
> the number of different windows rather than the number of window functions.
>
> In other words, every optimization that is correct for a single window
> function is also correct for multiple window functions if those all use
> the same window.

Yeah, I mean multiple windows, not window functions.

>> That said, our planner on window functions has a lot of improvement
>> still to be done. Every kind of optimization I see is like what I raised
>> above; they can either be done easily by hacking in a small case, or they
>> can be done by generalizing for most of the cases. My understanding is our
>> project tends to like the latter; it takes a little more time but covers
>> more use cases.
>
> I'd also prefer to see a general solution, as this provides less
> room for unpleasant surprises (e.g. "This query is only slightly
> different from the previous one. Why does it take so much longer?").
>
> On the other hand, any small improvement is a big step forward
> regarding window functions.
>
> Unfortunately, I can't volunteer on that, as it is currently
> impossible for me to allocate enough time for this.
>
> However, any pointer to where to look at the source (or in the
> manual) would be of great help. Maybe I'll find at least enough time
> to provide a rough proposal, or to improve existing attempts
> to solve this issue.

Look at subquery_is_pushdown_safe in allpaths.c. That is where we stop considering whether the upper qualification can be pushed down into the inner sub-query when the sub-query has any window function expressions.
Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
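For illustration, here is a minimal sketch of the case being discussed (the table and column names are invented): both window functions partition by id and the outer filter is on id, so in principle the qual could run before WindowAgg rather than after it.

CREATE TABLE events (id int, ts timestamptz, amount numeric);

SELECT *
FROM (SELECT id, ts,
             sum(amount) OVER (PARTITION BY id ORDER BY ts) AS running_total,
             count(*)    OVER (PARTITION BY id)             AS n_events
      FROM events) AS s
WHERE id = 42;   -- today this filter is applied only after the WindowAgg step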
Re: [HACKERS] Schema version management
On the topic of fixing pg_dump to dump in a predictable order, can someone please update me on the current state of the problem? I've read through pg_dump_sort.c, and note that objects are first sorted in a type/name-based ordering, then topologically sorted in a way which aims to "minimize unnecessary rearrangement". How come this doesn't always generate a predictable order? Any ideas on how to fix the problem? If someone gives me a hint I might make an effort to implement the idea.

If pg_dump dumped in a predictable order, it would make sense to dump all overloaded versions of functions sharing the same name into the same file. Then it would be _guaranteed_ that two different databases committing their schema to a shared VCS commit exactly the same files if the schema is the same, which is not guaranteed unless the dump order is predictable.

Having thought about it, I agree the idea with arguments in filenames is probably possible, but suboptimal. Much better to write all overloaded functions to the same file and fix the predictable-dump-order problem.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
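As a hedged sketch of that idea (the function name and bodies are invented), both overloads below would end up in the same file, e.g. public/FUNCTION/get_balance.sql, once the dump order is predictable:

CREATE FUNCTION get_balance(account_id integer) RETURNS numeric
    LANGUAGE sql AS $$ SELECT 0::numeric $$;             -- placeholder body

CREATE FUNCTION get_balance(account_id integer, as_of timestamptz) RETURNS numeric
    LANGUAGE sql AS $$ SELECT 0::numeric $$;             -- placeholder body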
Re: [HACKERS] Schema version management
On Wed, May 23, 2012 at 9:09 AM, Tom Lane wrote:
> Thus, not implausibly, causing the dump to fail entirely on some
> filesystems. Case sensitivity, encoding issues, and special characters
> in names (eg slashes or backslashes, depending on platform) are
> additional pain points. This does not sound like a good plan from here.

This is true, which means some users won't be able to use the feature, because they are using an ancient OS or have function names with slashes (hm, is it even possible to have function names with slashes?).

The maximum length of names of tables, functions, etc. in Postgres is 63 characters. A function in Postgres can have at most 100 arguments. The vast majority of users run operating systems allowing file names of at least 255 characters: http://en.wikipedia.org/wiki/Comparison_of_file_systems

I suppose you have a lot more experience of what Postgres installations exist in the world. Do you think it's common for databases to have non-ASCII or otherwise problematic characters in object names? Is it a project policy that all features of all standard tools must be useful for all users on all platforms on all databases? Or is it acceptable if some features are only usable for, say, 90% of the users?

> Taking a step or two back, it seems to me that the thrust of your
> proposal is essentially to throw away all dump ordering information,
> which does not seem like a particularly good idea either. It certainly
> will not lead to a dump that can be restored reliably. If the use-case
> for this is database comparisons, I think we'd be a lot better off to
> write a postprocessing tool for regular dumps to perform such
> comparisons, rather than whacking pg_dump around to the point where it's
> unable to perform its primary function.

Not at all, the ordering information is not thrown away; it is preserved in the dump file specified by the -f option, from which each split file is included using \i. For example, this is an extract of the -f dump file in my database:

--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET client_encoding = 'SQL_ASCII';
SET standard_conforming_strings = off;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET escape_string_warning = off;

--
-- Name: pgx_diag; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA pgx_diag;

ALTER SCHEMA pgx_diag OWNER TO postgres;

-- ... some more schemas, languages etc ...
-- ... and then all the included files:

\i /home/postgres/database/gluepay-split/public/TYPE/dblink_pkey_results.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_matchedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/r_unapprovedwithdrawal.sql
\i /home/postgres/database/gluepay-split/public/TYPE/ukaccountvalidationchecktype.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/check_name.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/describe_entityid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/get_linkid.sql
\i /home/postgres/database/gluepay-split/aml/FUNCTION/set_address.sql

-- ... all the objects ...

\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workerstatusid_fkey.sql
\i /home/postgres/database/gluepay-split/public/FK_CONSTRAINT/workershistory_workertypeid_fkey.sql

-- ...
and after all the included files come permissions and stuff:

--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
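For illustration only, here is a guess at what one of those per-object split files might contain; the file name, function body and owner below are invented, not taken from the actual dump:

-- public/FUNCTION/is_valid_name.sql (hypothetical file)
CREATE FUNCTION is_valid_name(_name text) RETURNS boolean
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT $1 ~ '^[A-Za-z ]+$' $$;

ALTER FUNCTION is_valid_name(text) OWNER TO postgres;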
Re: [HACKERS] Draft release notes complete
On Wed, May 23, 2012 at 01:38:06AM +0400, Alexander Korotkov wrote: > On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian wrote: > > On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > > "Improve GiST box and point index performance by producing better trees > with > > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, > Kevin > > Grittner)" > > Is this note about following two commits? > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > d50e1251946a6e59092f0a84fc903532eb599a4f > > These improvements influence not only boxes and points but all > geometrical > > datatypes. > > OK, new wording: > >Improve GiST geometric type index performance by producing better >trees with less memory allocation overhead (Alexander Korotkov) > > > Thanks! > > Also, I've some notes about removing reviewers. > "Improve GiST index build times (Alexander Korotkov)" > I think Heikki Linnakangas should be also listed as author of that patch > because he didn't only review and commit, but actually put his hands on code. OK, Heikki added. > Isn't my authorship of this patch lost now? > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > 80da9e68fdd70b796b3a7de3821589513596c0f7 > I think earlier this patch was taken into account in entry "Add support for > range data types". Probably, we need separate entry for this patch? I thought that was more of a Gist index improvement than a range type improvement, but I have added your name to the range type item. Thanks. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
Joel Jacobson writes: > If the entire function identity arguments would be included in the filename, > two dumps of the same schema in two different databases > would be guaranteed to produce the same dump. > This would render some very long filenames for functions with many arguments, Thus, not implausibly, causing the dump to fail entirely on some filesystems. Case sensitivity, encoding issues, and special characters in names (eg slashes or backslashes, depending on platform) are additional pain points. This does not sound like a good plan from here. Taking a step or two back, it seems to me that the thrust of your proposal is essentially to throw away all dump ordering information, which does not seem like a particularly good idea either. It certainly will not lead to a dump that can be restored reliably. If the use-case for this is database comparisons, I think we'd be a lot better off to write a postprocessing tool for regular dumps to perform such comparisons, rather than whacking pg_dump around to the point where it's unable to perform its primary function. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
> This seems like a rather expensive solution to a problem that I'm not
> really convinced is real. Why should a client program care about the
> severity level to a greater extent than whether the message is
> ErrorResponse or NoticeResponse? In particular, I'm entirely
> unconvinced by your claim that pgpool ought to treat ERROR and FATAL
> cases differently. Whatever it does about session termination ought to
> be driven by the connection closure, not by the content of the message.
> (As a counterexample, what if the backend crashes without delivering any
> message at all?)

Pgpool already detects session termination and handles it. BTW, pgpool uses a blocking socket and issues select(2), watching the read/exception state of the file descriptor before reading from the socket. It seems that even if the backend terminates the session, select() does not respond immediately but waits forever. I had thought that in this case select() would return and reading the socket would return EOF. So it seems it is sometimes hard to know whether the socket was closed by the backend. This is one of the reasons why I don't want to rely entirely on the physical session termination event.

Moreover, what pgpool would like to do here is provide better service to users, because pgpool is not just a proxy server for PostgreSQL. For example, pgpool does failover if one of the PostgreSQL nodes goes down. Pgpool can detect a planned postmaster shutdown by watching the health of PostgreSQL, but it is much better to trigger failover by detecting PostgreSQL's admin shutdown case, because it is quicker (less cluster down time) and more reliable (a health check needs to retry before judging the postmaster down, because of temporary network errors or some such).

> Moreover, if we did add this starting in 9.3, it would
> still be many years before clients could rely on it being provided,
> which means you'd need some other solution anyway.

Pgpool relies on PostgreSQL and cannot provide more services than what PostgreSQL does. Users can do more if they use a newer version of PostgreSQL. This applies not only to this particular problem but to other things as well.

> Another issue is that if we do this, we're essentially (IMO) promising
> that the set of severity codes won't change in the future, which may
> not be a good thing to promise.

Why do you care? The list of severities is clearly stated in the documentation, and you cannot arbitrarily change it without a major version change anyway. There's nothing different here from any other feature that PostgreSQL explicitly provides.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
On Wed, May 23, 2012 at 3:24 AM, Daniel Farina wrote:
> Is there a reason why the current directory format could not be
> adjusted to become more human-readable friendly for mechanical
> reasons? I realize there is a backwards compatibility problem, but it
> may be better than bloating a new option.

I like your idea; the format would then still be "directory", while the option would be something like --human-friendly? Currently, the directory format only dumps the data of tables into separate files. Everything else goes into the toc.dat file. To make it work, the stuff written to the toc.dat file would also have to be written to separate files.

> But I don't see how making the directory output format more
> human-friendly could be seen as a bad thing overall, except in the
> notable axis of implementation complexity. Silly issues like naming
> files on different platforms, case sensitivity, and file length
> restrictions may rear their ugly head.

If the entire set of function identity arguments were included in the filename, two dumps of the same schema in two different databases would be guaranteed to produce the same dump. This would render some very long filenames for functions with many arguments, but the problem could at least be reduced by using the shorter alias for each data type, such as "varchar" instead of "character varying" and "timestamptz" instead of "timestamp with time zone", etc. http://www.postgresql.org/docs/devel/static/datatype.html#DATATYPE-TABLE

Also, to save even more space, since the name of the function can be long too, the function name could be made a directory, and the different overloaded variants different files, e.g.:

/public/FUNCTION/myfunc/int.sql
/public/FUNCTION/myfunc/int_timestamptz.sql

And functions with no arguments are written to a single file (with a .sql suffix to avoid conflicts with an eventual directory name for the function):

/public/FUNCTION/myfunc.sql

> I think about this because in addition to the data types and operators
> defined in the development process, there are often small tables that
> need to be loaded with content and version controlled as well, rather
> like userland-equivalents pg_enum entries.

Is there a term for such tables? I use the term "lookup tables", but perhaps there is a better one? In my schema, they typically map status IDs, state IDs, etc. to human-friendly names. E.g., if Orders is a huge table for all orders, I might have an OrderStatuses table to look up all the OrderStatusID columns in Orders:

Orders.OrderStatusID -fk-> OrderStatuses.OrderStatusID

OrderStatuses.Name is unique and contains the human-friendly name of the status. These small lookup tables also need to be version controlled, of course.

This is a tricky one though, because you might have small tables with base data, but with references to other huge tables, which you don't want to include in your automatically version-controlled schema dump. I solved this problem by creating a quite complex recursive plpgsql function that resolves all dependencies and includes only the required rows from each table: you specify a regex matching a list of tables, which in turn resolves to all tables they have references to, and it dumps those tables too, but only the required rows. The result is a dump of each such table into a separate file, in a restorable order that does not cause any dependency problems. Then I have a similar function to do the restoring. I use this approach to build a restorable clean test database of any version of the system, be it production or some developer's local version of it.
And also, not to forget: to make it work, all the sequences also need to be restarted to the same values as in the original database after the dump is restored. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
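A hedged sketch of the kind of lookup table described above, using the names from the message; the exact column types and status values are assumptions:

CREATE TABLE OrderStatuses (
    OrderStatusID integer PRIMARY KEY,
    Name          text NOT NULL UNIQUE    -- human-friendly name of the status
);

CREATE TABLE Orders (
    OrderID       bigserial PRIMARY KEY,
    OrderStatusID integer NOT NULL REFERENCES OrderStatuses (OrderStatusID)
    -- ... many more columns ...
);

-- The small lookup table is what one would want under version control;
-- the huge Orders table is only referenced, not dumped.
INSERT INTO OrderStatuses VALUES (1, 'Pending'), (2, 'Shipped'), (3, 'Cancelled');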
Re: [HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
Tatsuo Ishii writes: > I described the problem with possibly localized "S" filed of > ErrorResponse(and NoticeResponse) in Frontend/Backend protocol. > http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php > So I would like to propose a solution for this (for 9.3): add new > field to ErrorResponse and NoticeResponse. The new field will have the > same value as "S" except that it's never localized. This will not only > solve the problem I described but possibly reduce the cost to analyze > the error/notice messages in the frontend programs. This seems like a rather expensive solution to a problem that I'm not really convinced is real. Why should a client program care about the severity level to a greater extent than whether the message is ErrorResponse or NoticeResponse? In particular, I'm entirely unconvinced by your claim that pgpool ought to treat ERROR and FATAL cases differently. Whatever it does about session termination ought to be driven by the connection closure, not by the content of the message. (As a counterexample, what if the backend crashes without delivering any message at all?) Moreover, if we did add this starting in 9.3, it would still be many years before clients could rely on it being provided, which means you'd need some other solution anyway. Another issue is that if we do this, we're essentially (IMO) promising that the set of severity codes won't change in the future, which may not be a good thing to promise. > BTW, changing existing "S" field not to be localized would work but > I'm afraid it breaks backward compatibility. We made it localized intentionally, on the grounds that its principal and probably sole use was for presentation to human users. I've not heard previous complaints about that decision. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On Tue, May 22, 2012 at 4:35 PM, Stephen Frost wrote: > * Josh Berkus (j...@agliodbs.com) wrote: >> The local role is preferred, the same way we allow objects in the local >> schema to overshadow objects in the global schema. > > I would think we'd want the exact opposite. I don't want my global > 'postgres' user to be overwritten by some local one that the admin of > this particular DB created.. In object-orientedness, the usual behaviour is for more specific methods to override the more generic ones, which is reasonable. I'm not certain which direction is more to be preferred, whether: a) To consider the global user as a default, to be overridden if possible, or b) To consider the local user as the default, to be overridden if possible. They're both tenable positions. But I think I agree with Stephen, that what's desirable, with global users, is to use them as the override. They're gonna be expensive, you should get something for the price :-). -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: add new field to ErrorResponse and NoticeResponse
I described the problem with the possibly localized "S" field of ErrorResponse (and NoticeResponse) in the Frontend/Backend protocol: http://archives.postgresql.org/pgsql-hackers/2012-05/msg00912.php

So I would like to propose a solution for this (for 9.3): add a new field to ErrorResponse and NoticeResponse. The new field will have the same value as "S" except that it's never localized. This will not only solve the problem I described but possibly also reduce the cost of analyzing the error/notice messages in frontend programs.

Adding a new field is possible without breaking the current version of the Frontend/Backend protocol, since the protocol is extensible in this area: "Since more field types might be added in future, frontends should silently ignore fields of unrecognized type." (from "46.5. Message Formats" of the PostgreSQL 9.2 documentation)

BTW, changing the existing "S" field not to be localized would work, but I'm afraid it breaks backward compatibility. Comments?

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> That's only true if you try to satisfy both goals at once, which I'm > not suggesting. So I believe that proposition to be false. Oh, ok. Per your original email and follow-up arguments, you seemed to be doing just that. >> An alternative idea -- and one which could be deployed a lot faster -- >> is to come up with a tool which makes it easy to migrate an entire >> database into a separate schema or set of schemas in an existing >> database. And improvements to manage schema visility/path better, I >> suppose. > > Yes, it is possible to improve things there also. Feh, and nested schema, for that matter. So, there's a fair bit of work wherever we bite it off. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
> The issue with not allowing global spaces to overlap local ones is that
> we'd have to check every local list when creating a global account;
> that doesn't seem very easy to do. On the flip side, allowing
> duplicates between global and local would remove the need to check local
> lists when creating global accounts, but would add complexity and could
> lead to odd semantics when there is a duplicate.

On the other hand, keep in mind that creating a global account can be slow. For anyone who has a huge multi-tenant setup with 200 databases, each with their own local users, creating a new global account will be an event which occurs once or twice a year. Just so that we don't pay the same check cost for people who don't use local accounts.

> If you could help me work out the semantics and the high-level issues,
> I'd love to spend time on this for 9.3...

Syntax seems simple: CREATE LOCAL ROLE ...

For that matter, let's keep other things simple:
1. local roles can inherit only from other local roles
2. global roles can inherit only from other global roles
3. only a global role can be a database owner

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
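A sketch of how those rules might look in use. None of this exists today; CREATE LOCAL ROLE and the behaviour shown are purely hypothetical:

-- Global role: expensive to create, may own databases (rule 3).
CREATE ROLE app_owner LOGIN;
CREATE DATABASE tenant1 OWNER app_owner;

-- Inside tenant1: local roles visible only in this database (hypothetical syntax).
CREATE LOCAL ROLE tenant1_admin LOGIN;
CREATE LOCAL ROLE tenant1_reader LOGIN IN ROLE tenant1_admin;  -- rule 1: local inherits local

-- Rule 3 would forbid a local role owning a database:
-- CREATE DATABASE tenant2 OWNER tenant1_reader;               -- would be rejected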
Re: [HACKERS] Draft release notes complete
On Wed, May 23, 2012 at 1:26 AM, Bruce Momjian wrote: > On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > > "Improve GiST box and point index performance by producing better trees > with > > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, > Kevin > > Grittner)" > > Is this note about following two commits? > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > > d50e1251946a6e59092f0a84fc903532eb599a4f > > These improvements influence not only boxes and points but all > geometrical > > datatypes. > > OK, new wording: > >Improve GiST geometric type index performance by producing better >trees with less memory allocation overhead (Alexander Korotkov) Thanks! Also, I've some notes about removing reviewers. "Improve GiST index build times (Alexander Korotkov)" I think Heikki Linnakangas should be also listed as author of that patch because he didn't only review and commit, but actually put his hands on code. Isn't my authorship of this patch lost now? http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=80da9e68fdd70b796b3a7de3821589513596c0f7 I think earlier this patch was taken into account in entry "Add support for range data types". Probably, we need separate entry for this patch? -- With best regards, Alexander Korotkov.
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 18:41, Tom Lane wrote:
> It'd be better to work on index-organized tables

My earlier analysis showed that IOTs are essentially the same thing as block-level indexes, referred to as GITs by Heikki. (Robert referred to these as Lossy Indexes recently, which was not the case - that aspect was exactly the reason for rejection previously, so we should not retread that path - indexes can operate at block level without being lossy.) The number of index pointers is identical in each case, so IOTs are not any more efficient in terms of space usage or I/O.

IOTs are much more difficult to implement, so I can't see any reason to work on them. For example, having heap rows migrate on a block split will cause havoc with our index implementation. We haven't worked out how to re-join blocks that have split while maintaining concurrency, so IOTs would require some pretty drastic repacking with a severe lock type.

Please, let's avoid IOTs.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Wed, May 16, 2012 at 10:49:25PM +0300, Heikki Linnakangas wrote: > On 16.05.2012 22:38, Jeff Janes wrote: > >For item: > >Improve COPY performance by adding tuples to the heap in batches > >(Heikki Linnakangas) > > > >I think we should point out that the batching only applies for COPY > >into unindexed tables. Nice as the feature is, that is pretty big > >limitation not to mention. > > No, it applies to indexed tables too. However, if there are indexes > on the table, the overhead of updating the indexes will probably > make any speedup in the heap insertions look tiny in comparison. > > The optimization doesn't apply if the table has BEFORE or INSTEAD OF > triggers, or volatile DEFAULT expressions need to be evaluated. So, I will assume the existing wording is fine. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Thu, May 10, 2012 at 10:22:58PM +0400, Alexander Korotkov wrote: > "Improve GiST box and point index performance by producing better trees with > less memory allocation overhead (Alexander Korotkov, Heikki Linnakangas, Kevin > Grittner)" > Is this note about following two commits? > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > 7f3bd86843e5aad84585a57d3f6b80db3c609916 > http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h= > d50e1251946a6e59092f0a84fc903532eb599a4f > These improvements influence not only boxes and points but all geometrical > datatypes. OK, new wording: Improve GiST geometric type index performance by producing better trees with less memory allocation overhead (Alexander Korotkov) -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 19:01, Robert Haas wrote:
> On Tue, May 22, 2012 at 1:41 PM, Tom Lane wrote:
>> Jeff Janes writes:
>>> Now that there are index only scans, there is a use case for having a
>>> composite index which has the primary key or a unique key as the
>>> prefix column(s) but with extra columns after that. Currently you
>>> would also need another index with exactly the primary/unique key,
>>> which seems like a waste of storage and maintenance.
>>
>>> Should there be a way to declare a "unique" index with the unique
>>> property applying to a prefix of the indexed columns/expression? And
>>> having that, a way to turn that prefix into a primary key constraint?
>>
>>> Of course this is easier said than done, but is there some reason for
>>> it not to be a to-do item?
>>
>> Um ... other than it being ugly as sin? I can't say that I can get
>> excited about this concept. It'd be better to work on index-organized
>> tables, which is really more or less what you're wishing for here.
>> Duplicating most of a table into an index is always going to be a loser
>> in the end because of the redundant storage.
>
> An index on pgbench_accounts (aid, abalance) is the same size as an
> index on pgbench_accounts (aid), but even if it were larger, there's
> no theoretical reason it couldn't have enough utility to justify its
> existence.

Agreed

> A bigger problem is that creating such an index turns all
> of pgbench's write traffic from HOT updates into non-HOT updates,
> which means this is probably only going to be a win if the write
> volume is miniscule.

Not sure whether you see that as an argument against the proposal. This argument applies to any index. In particular, covered indexes are specifically encouraged by index-only scans, so this is not a reason to avoid implementing the feature as Jeff describes.

The main reason for the feature as described by Jeff is that it avoids having 2 indexes when only one is required. In 9.2, with index-only scans, Jeff showed elsewhere that we can get an amazing speedup by having a covered index. However, what Jeff is noticing is that he needs 2 indexes on the table: 1 PK on (aid) and another index on (aid, abalance). The first index can be avoided altogether, allowing a good improvement in cache efficiency and general performance.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
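To make the trade-off concrete, here is a sketch against pgbench's schema (pgbench already creates the primary key itself; the unique-on-a-prefix syntax at the end is hypothetical and only illustrates the proposal):

-- Today: two indexes are needed to get both the uniqueness guarantee on aid
-- and an index-only scan that can also return abalance.
ALTER TABLE pgbench_accounts ADD PRIMARY KEY (aid);   -- pgbench does this already
CREATE INDEX ON pgbench_accounts (aid, abalance);

-- The query that benefits from the covering index via an index-only scan:
SELECT abalance FROM pgbench_accounts WHERE aid = 12345;

-- The proposal would allow one index whose unique property applies only to the
-- prefix column(s); hypothetical syntax, not accepted by PostgreSQL today:
-- CREATE UNIQUE INDEX ON pgbench_accounts (aid, abalance) UNIQUE ON (aid);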
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 22:35 , Stephen Frost wrote: > * Josh Berkus (j...@agliodbs.com) wrote: >> The local role is preferred, the same way we allow objects in the local >> schema to overshadow objects in the global schema. > > I would think we'd want the exact opposite. I don't want my global > 'postgres' user to be overwritten by some local one that the admin of > this particular DB created.. From a security POV, yup, you'd want global roles to take precedence. But OTOH, you wouldn't want your application to suddenly break because someone created a global role which shadows the local role you've created a year ago. Hm… thinking about this… No matter which takes precedence, we'd need some way to explicitly specify global or local scope anyway. And we'd have to prevent roles from being named in a way that conflicts with whatever explicit specification we come up with, even if that causes pain for some unlucky existing users. Avoiding these conflicts entirely is going to be impossible, I fear, since we don't currently restrict role names in any way AFAIK, and we store them in GUCs without any quoting. So maybe we should just pick some qualification like prefixing local roles with 'local.', forbid global roles from starting with 'local.', and be done with it? Not the most elegant solution maybe, but it avoids surprises... > The issue with not allowing global spaces to overlap local ones is that > we'd have to check every local list when creating a global account; > that doesn't seem very easy to do. "Not very easy" is quite an understatement, I fear. "Very nearly impossible" is more like it IMHO. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Wed, May 16, 2012 at 05:30:27PM -0400, Bruce Momjian wrote: > > I will make the adjustments outlined below as soon as I can. Done and committed. --- > > On Sun, May 13, 2012 at 12:37:52AM -0400, Robert Haas wrote: > > On Sat, May 12, 2012 at 8:11 PM, Euler Taveira wrote: > > > On 12-05-2012 10:27, Bruce Momjian wrote: > > >> How many names on a single item is ideal? The activity of reviewers and > > >> their names on commit messages has greatly expanded the number of > > >> potential names per item. > > >> > > > Main authors only. Reviewers should be mentioned only in the commit log. > > > If I > > > coded a feature and Bruce got the idea worked in another patch (that is > > > better > > > then mine), I think only Bruce should be credited in release notes (but I > > > could be mentioned in the commit log as the feature designer). However, > > > if I > > > posted a patch and Robert improved that patch using only 30% of my work, I > > > should be credited (as coauthor) because he used a considerable part of > > > my work. > > > > Completely agreed. If we're going to include names in the release > > notes, I agree that this is the way to do it, and I think it's what we > > have done in prior releases. > > > > I tend to err on the side of crediting people in the commit message > > (of course, occasionally I forget someone who should have been > > included), but I also try to make it clear by the phrasing whose code > > got included and who contributed in some other way - e.g. by reporting > > the problem, coming up with the original idea, or reviewing. I do > > this in part because I assumed that we'd use that as the criteria for > > including names in the release notes, as we have done in prior > > releases. So if I write: > > > > Euler Taveira, reviewed by Bruce Momjian, substantially rewritten by me > > > > ...then I expect that to turn up in the release notes as (Euler > > Taveira, Robert Haas). If I write: > > > > Euler Taveira, reviewed by Bruce Momjian, with minor cleanup by me > > > > ...then I expect that to turn up as (Euler Taveira). And if I write > > something like: > > > > Inspired by a patch from Euler Taveira. Review (in earlier versions) > > by Bruce Momjian. > > > > ...then I expect that to turn up as (Robert Haas) or (Robert Haas, > > Euler Taveira). > > > > In doubtful cases, I think it's generally appropriate to err on the > > side of crediting the person who was the original driving force behind > > the patch, and also to err on the side of not crediting the committer. > > But if the committer chopped up the patch and committed something > > significantly different from the original, then they should be > > credited - or blamed - for the result. > > > > -- > > Robert Haas > > EnterpriseDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. + > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:56, Josh Berkus wrote: > I'm not arguing that we don't have users who would like interdatabase > queries, especially when they port applications from MySQL or MSSQL. We > have a lot of such users. Lots and lots, yes. > However, we *also* have a lot of users who > would like to treat separate databases as virtual private instances of > Postgres, and there's no way to satisfy both goals. We have to choose > one route or the other. That's only true if you try to satisfy both goals at once, which I'm not suggesting. So I believe that proposition to be false. However, given sufficient people speaking against it, I'll leave this idea. Though I'd suggest that people on this thread spend a little quality time with FDWs. It's a great direction but there's a long way to go yet. Sorry to Laurenz, who's done a great job so far on the Oracle FDW. > I personally see the isolation case as the more necessary because there > are several workarounds for the "inter-database queries" issue I also want that, per my original post. > An alternative idea -- and one which could be deployed a lot faster -- > is to come up with a tool which makes it easy to migrate an entire > database into a separate schema or set of schemas in an existing > database. And improvements to manage schema visility/path better, I > suppose. Yes, it is possible to improve things there also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 8:28 PM, Robert Haas wrote:
> Totally agreed. We're not the first people to think of this, either:
> CLOCK and GLOCK have been extensively studied and found to be almost
> as good as LRU in selecting good victim pages, but with less
> contention. That's why people are using them.
>
> Here's a paper that defines GLOCK to be the algorithm we use (page 2,
> second column, second paragraph from the bottom), and furthermore
> mentions PostgreSQL (top of page 3):
>
> http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf

Interesting paper; they make the whole of buffer management lock-free, getting rid of not only the BufFreelistLock but also the BufMappingLocks and the buffer header spinlocks. Now, AFAIK the BufMappingLocks and buffer header spinlocks aren't really a big issue for us, and at least for the latter it looks like turning it lock-free would entail lots of pretty hairy and bug-prone code.

On the other hand, making the clock sweep lock-free looks relatively easy. As far as I can see there is no reason why nextVictimBuffer couldn't be read, incremented and atomically cmpxchg'ed to let others continue before checking the usage count. Likewise, completePasses and numBufferAllocs can easily be atomically incremented; the bgwriter shouldn't mind if the values are slightly out of date. The free list itself is a bit trickier, but if it's still necessary/useful, then SC->firstFreeBuffer and buf->freeNext are in effect a linked-list stack, and there should be plenty of tested lock-free algorithms floating around for that. (BTW, lastFreeBuffer looks like dead code, is that correct?)

As for better buffer management algorithms, have you read about CLOCK-Pro? [1] It looks like it's an adaptive variant of LIRS, the algorithm MySQL uses (or at least used some time ago). It looks like the Linux kernel devs also at least thought about implementing it [2] [3] (hard to tell exactly, their docs are pretty chaotic compared to pg's). According to [4], LIRS is almost as good as or better than LRU; by extension I'd expect CLOCK-Pro to be better than GLOCK. It still has a single clock mechanism, though, so better replacement policies won't help a whole lot with lock contention on buffer allocation.

[1] http://www.cse.ohio-state.edu/~fchen/paper/papers/usenix05.pdf
[2] http://linux-mm.org/ClockProApproximation
[3] http://linux-mm.org/PageReplacementDesign
[4] http://www.almaden.ibm.com/cs/people/dmodha/ARC.pdf

Cheers, Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
* Josh Berkus (j...@agliodbs.com) wrote: > The local role is preferred, the same way we allow objects in the local > schema to overshadow objects in the global schema. I would think we'd want the exact opposite. I don't want my global 'postgres' user to be overwritten by some local one that the admin of this particular DB created.. > The feature wouldn't be useful if we didn't allow conflicts between two > local role names. However, we could prohibit conflicts between a local > role name and a global role name if it made the feature considerably > easier. Users would find workarounds which weren't too arduous. Sorry, I was meaning between global space and local space. Clearly we must allow and handle cleanly overlaps between local spaces. The issue with not allowing global spaces to overlap local ones is that we'd have to check every local list when creating a global account; that doesn't seem very easy to do. On the flip side, allowing duplicates between global and local would remove the need to check local lists when creating global accounts, but would add complexity and could lead to odd semantics when there is a duplicate. > 1. create a new local role > 2. reassign all the objects belonging to the global role to the local role > 3. drop the global role > 4. rename the local role Right, that seems like it would work fine. > It'd be somewhat of a PITA, but I suspect that most people using the > "local roles" feature would recreate their databases from scratch > anyway. And we could offer some sample scripts for the above on the > wiki and elsewhere. Obviously, a more elegant migration command would > be ideal, but that could wait for the following PG release; we usually > follow the "make things possible first, and easy later" plan anyway. Sure. > Given that I'd love to have this feature, I'm trying to pare down its > requirements to a managable size. Trying to do everything at once will > only result in the feature stalling until 10.5. If you could help me work out the semantics and the high-level issues, I'd love to spend time on this for 9.3... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Schema version management
On Mon, May 21, 2012 at 5:25 PM, Joel Jacobson wrote:
> If one wants to reuse the splitting-to-files code of the directory
> format, maybe the existing option -F d could be tweaked to output
> both a machine-readable format (current way) and also a
> human-friendly tree of files and content (like suggested by my patch).
>
> I wonder what the option would be called then; having two-char
> options is not an option I guess, maybe -F t for "tree" instead of
> "directory", as the -F d option only dumps to a single directory and
> not a tree structure?

Is there a reason why the current directory format could not be adjusted to become more human-readable friendly for mechanical reasons? I realize there is a backwards compatibility problem, but it may be better than bloating a new option.

Andrew's approach of reading the TOC would also be good, so that pg_dump can avoid serving the additional masters of schema versioning and development usability in addition to dumping. The TOC is the closest thing we have to the library-ification of pg_dump in the near term.

But I don't see how making the directory output format more human-friendly could be seen as a bad thing overall, except in the notable axis of implementation complexity. Silly issues like naming files on different platforms, case sensitivity, and file length restrictions may rear their ugly head.

I think about this because, in addition to the data types and operators defined in the development process, there are often small tables that need to be loaded with content and version controlled as well, rather like userland equivalents of pg_enum entries.

-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 10:56 AM, Josh Berkus wrote:
> I'm not arguing that we don't have users who would like interdatabase
> queries, especially when they port applications from MySQL or MSSQL. We
> have a lot of such users. However, we *also* have a lot of users who
> would like to treat separate databases as virtual private instances of
> Postgres, and there's no way to satisfy both goals. We have to choose
> one route or the other.

I think the idea of a physical machine where catalogs are physically (shared-everything) co-located is one that will not stand for long as part of a useful contract between a user and the database. I'd really like to avoid an extra tier of functionality that exists only for databases that happen to land on the same physical machine. I think any inter-database feature should work identically between two databases across a network as between two databases on one machine/cluster.

Transparent optimizations to deal with the special case of physical co-location are not contrary to that contract, but I don't have a sense of how important those optimizations would be before getting a lot of the usability issues figured out. Right now, it seems to me that getting interdatabase usability feeling better is already pretty hard.

-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 1:36 PM, Simon Riggs wrote: > On 22 May 2012 18:24, Jeff Janes wrote: > > Now that there are index only scans, there is a use case for having a > > composite index which has the primary key or a unique key as the > > prefix column(s) but with extra columns after that. Currently you > > would also need another index with exactly the primary/unique key, > > which seems like a waste of storage and maintenance. > > > > Should there be a way to declare a "unique" index with the unique > > property applying to a prefix of the indexed columns/expression? And > > having that, a way to turn that prefix into a primary key constraint? > > > > Of course this is easier said then done, but is there some reason for > > it not to be a to-do item? > > +1 > > Very useful > > Semi-private note to Simon: isn't this pretty much what I was advocating at the London meetup last month?
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 10:41 AM, Tom Lane wrote: > Jeff Janes writes: >> Now that there are index only scans, there is a use case for having a >> composite index which has the primary key or a unique key as the >> prefix column(s) but with extra columns after that. Currently you >> would also need another index with exactly the primary/unique key, >> which seems like a waste of storage and maintenance. > >> Should there be a way to declare a "unique" index with the unique >> property applying to a prefix of the indexed columns/expression? And >> having that, a way to turn that prefix into a primary key constraint? > >> Of course this is easier said then done, but is there some reason for >> it not to be a to-do item? > > Um ... other than it being ugly as sin? I can't say that I can get > excited about this concept. It'd be better to work on index-organized > tables, which is really more or less what you're wishing for here. IOT would probably be a nice feature too, but adding one more strategically chosen column to an index is quite different from adding every column into the index. At least in the general case. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: parameterized function scan
On Fri, May 11, 2012 at 5:52 PM, Antonin Houska wrote: > Hello, > following this short discussion > http://archives.postgresql.org/message-id/4f5aa202.9020...@gmail.com > I gave it one more try and hacked the optimizer so that function can become > an inner relation in NL join, parametrized with values from the outer > relation. > > I tried to explain my thoughts in comments. Other than that: > > 1. I haven't tried to use SpecialJoinInfo to constrain join order. Even if > the order matters in query like > SELECT * from a, func(a.i) > it's still inner join by nature. SpecialJoinInfo is used for INNER join > rarely, but never stored in PlannerInfo. Doing so only for these lateral > functions would be rather disruptive. > > 2. Simple SQL function (i.e. one that gets pulled-up into the main query) is > a special case. The query that results from such a pull-up no longer > contains any function (and thus is not affected by this patch) but such > cases need to be newly taken into account and examined / tested (the patch > unblocks them at parsing stage too). > > 3. There might be some open questions about SQL conformance. > > I've spent quite a while looking into the optimizer code and after all I was > surprised that it didn't require that many changes. At least to make few > simple examples work. Do I ignore any important fact(s) ? This implementation looks different than I'd expect: I would have thought that it would work by generating paths with param_info set to the appropriate set of rels to provide the necessary values, rather than inventing its own mechanism for forcing a nestloop. Also, I think we will want something that implements the LATERAL() syntax, rather than just removing the prohibition on lateral references. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
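A sketch of the kind of query the patch targets (the table and function are invented). The first form is the lateral function reference the patch unblocks; the second spells it with the explicit LATERAL keyword Robert mentions:

CREATE TABLE a (i int);
CREATE FUNCTION func(int) RETURNS SETOF int
    LANGUAGE sql AS $$ SELECT $1 * 2 $$;    -- placeholder body

-- func is re-evaluated for each row of a, i.e. a nested loop with a
-- parameterized function scan on the inner side:
SELECT * FROM a, func(a.i);

-- The same query with the explicit LATERAL syntax:
SELECT * FROM a, LATERAL func(a.i);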
Re: [HACKERS] Exclusion Constraints on Arrays?
On May 22, 2012, at 9:56 AM, Tom Lane wrote: > It seems probably workable given that we expect the pending list to be > of fairly constrained size. However, the commit message referenced > upthread also muttered darkly about GIN's partial match logic not working > in amgettuple. I do not recall the details of that issue, but unless we > can solve that one too, there's not much use in fixing this one. Well, what about a GiST operator family/class for arrays? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
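For context, a hedged sketch of the contrast: range types ship with a GiST operator class, so an overlap exclusion constraint works out of the box, while the array form below is exactly what fails today for lack of a suitable operator class (the tables are invented):

-- Works today: built-in GiST support for range types.
CREATE TABLE room_booking (
    room   text,
    during tsrange,
    EXCLUDE USING gist (during WITH &&)
);

-- What the thread is after: the same idea for arrays. This currently fails
-- because there is no default GiST operator class for text[].
CREATE TABLE tagged_item (
    id   serial PRIMARY KEY,
    tags text[],
    EXCLUDE USING gist (tags WITH &&)
);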
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On Tue, May 22, 2012 at 1:41 PM, Tom Lane wrote: > Jeff Janes writes: >> Now that there are index only scans, there is a use case for having a >> composite index which has the primary key or a unique key as the >> prefix column(s) but with extra columns after that. Currently you >> would also need another index with exactly the primary/unique key, >> which seems like a waste of storage and maintenance. > >> Should there be a way to declare a "unique" index with the unique >> property applying to a prefix of the indexed columns/expression? And >> having that, a way to turn that prefix into a primary key constraint? > >> Of course this is easier said then done, but is there some reason for >> it not to be a to-do item? > > Um ... other than it being ugly as sin? I can't say that I can get > excited about this concept. It'd be better to work on index-organized > tables, which is really more or less what you're wishing for here. > Duplicating most of a table into an index is always going to be a loser > in the end because of the redundant storage. An index on pgbench_accounts (aid, abalance) is the same size as an index on pgbench_accounts (aid), but even if it were larger, there's no theoretical reason it couldn't have enough utility to justify its existence. A bigger problem is that creating such an index turns all of pgbench's write traffic from HOT updates into non-HOT updates, which means this is probably only going to be a win if the write volume is miniscule. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> Why is it OK to allow somebody to access multiple schemas in one query,
> but not multiple databases? Are you arguing that schemas are also
> broken?

Because the purpose of a database is to be a Catalog, i.e. an isolated container, which is not the purpose of schemas. To the extent to which we can make the isolation a reality (instead of the "lossy isolation" we have now), we can enable many multitenant hosting designs which aren't currently possible.

However, pursuing interdatabase queries at the same time we try to add new isolation features is a doomed effort. For example, if we created local database users (see other thread), then what would happen if a local user tried to execute a cross-database query? If we enable physical migration of a database to a new Postgres instance, what happens to standing multi-database views? If interdatabase queries are allowed, how do I, as a hosting operator, make sure that users can't even see the other databases on the system?

> I see no failure by design. I see an idea for greater ease of use
> being discussed.

You can't attempt mutually contradictory requirements and expect to succeed, or to improve ease of use. You can't ride two horses, especially if they're going in opposite directions.

> Personally, I have long recommended that people use schemas. But
> people do use databases and when they do they are pretty much screwed.
> I brought this up as a way of improving our ease of use.

I'm not arguing that we don't have users who would like interdatabase queries, especially when they port applications from MySQL or MSSQL. We have a lot of such users. However, we *also* have a lot of users who would like to treat separate databases as virtual private instances of Postgres, and there's no way to satisfy both goals. We have to choose one route or the other.

I personally see the isolation case as the more necessary because there are several workarounds for the "inter-database queries" issue, but nothing for the "multitenant catalog" case. Also, treating databases as catalogs is more consistent with our historical approach, and will thus break backwards compatibility less.

Maybe interdatabase queries are more useful/desired than catalog features. If that's the case, then we need to pursue them and abandon efforts like per-database users. But we have to make a choice.

An alternative idea -- and one which could be deployed a lot faster -- is to come up with a tool which makes it easy to migrate an entire database into a separate schema or set of schemas in an existing database. And improvements to manage schema visibility/path better, I suppose.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Problem with error response message
On 22 May 2012 12:59, Tatsuo Ishii wrote: > Also I wonder why "conflict with recovery" generates the same error code > as a serialization error. That does not seem a very consistent choice to me. > IMO, we should at least assign different error codes for effects that are > totally different for the frontend: aborting the transaction versus > aborting the whole session. Some conflicts cause ERROR, some cause FATAL. So there is not a distinct difference. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 10:43 AM, Simon Riggs wrote: > On 22 May 2012 18:35, Josh Berkus wrote: >> >>> If I have a customer with 1 database per user, how do they run a query >>> against 100 user tables? It would require 100 connections to the >>> database. Doing that would require roughly x100 the planning time and >>> x100 the connection delay. Larger SQL statements pass their results >>> between executor steps using libpq rather than direct calls. >> >> Why is this hypothetical customer using separate databases? This really >> seems like a case of "doctor, it hurts when I do this". > > Databases are great for separating things, but sometimes you want to > un-separate them in a practical way. In my experience, these un-separations are (thankfully) relieved of the requirement of consistency between databases, and so the contract is much more favorable. The planning time problem is quite hard. However, I think the connection-delay one is not as hard a one to answer: I think multiplexed protocols are going to become the norm in the near future (they have been a pretty uncontested part of the SPDY protocol, for example, after flow control was added) and have a number of useful properties, and it may be time to consider how we're going to divorce the notion that one socket implies exactly one backend. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 1:27 PM, Simon Riggs wrote: > Ack, apart from the bit about OIDs no longer being unique. That might > be an upgrade issue but it's obviously something we wouldn't allow if > we did that. And how exactly are you going to disallow that? We currently enforce the uniqueness of OIDs within a database by putting a unique index on the relevant system catalog columns, but that clearly won't work as a means of guaranteeing cross-database uniqueness. Unless of course you put all the catalog entries from all the databases into a single set of catalogs; but then they're pretty much the same as the schemas we already have. > I'm not sure I see changing the caches as being massive surgery. > Perhaps we could just bypass them altogether. You're entitled to your opinion on this one, but we have those caches for a very good reason: the system runs about 20,000 times slower without them. >> So I'd rather see us put the effort into pgsql_fdw, which, as Florian >> says, will also let us talk to a completely separate server. If >> you've got multiple databases in the same cluster and really need to >> be doing queries across all of them, that's what schemas are >> supposedly for. Now I know that people feel that doesn't work as well >> as it needs to, but there again I think it would be easier to fix >> schemas than to make cross-database queries work. > > We're a very long way from making that work well. IMHO easily much > further than direct access. > > If I have a customer with 1 database per user, how do they run a query > against 100 user tables? It would require 100 connections to the > database. Doing that would require roughly x100 the planning time and > x100 the connection delay. Larger SQL statements pass their results > between executor steps using libpq rather than direct calls. > > I find it hard to believe FDWs will ever work sufficiently well to fix > those problems. If you have got a customer with one database per user, and yet you want to run queries across all those tables, then you're using the database system for something for which it is not designed, and it's probably not going to work very well. That sounds like a use case for schemas, or maybe better, some kind of row-level security we don't have yet - but hopefully will in 9.3, since KaiGai intends to work on it. Databases are designed to act as a firewall, so that somebody using one database isn't affected by what happens in some other database. Unfortunately, because of shared catalogs, that's not completely true, but that's the idea, and if anything we need to isolate them more, not less. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:35, Josh Berkus wrote: > >> If I have a customer with 1 database per user, how do they run a query >> against 100 user tables? It would require 100 connections to the >> database. Doing that would require roughly x100 the planning time and >> x100 the connection delay. Larger SQL statements pass their results >> between executor steps using libpq rather than direct calls. > > Why is this hypothetical customer using separate databases? This really > seems like a case of "doctor, it hurts when I do this". Databases are great for separating things, but sometimes you want to un-separate them in a practical way. I'm surprised that you're so negative about an ease of use feature. I had understood you cared about fixing problems experienced by our developers. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
Jeff Janes writes: > Now that there are index only scans, there is a use case for having a > composite index which has the primary key or a unique key as the > prefix column(s) but with extra columns after that. Currently you > would also need another index with exactly the primary/unique key, > which seems like a waste of storage and maintenance. > Should there be a way to declare a "unique" index with the unique > property applying to a prefix of the indexed columns/expression? And > having that, a way to turn that prefix into a primary key constraint? > Of course this is easier said than done, but is there some reason for > it not to be a to-do item? Um ... other than it being ugly as sin? I can't say that I can get excited about this concept. It'd be better to work on index-organized tables, which is really more or less what you're wishing for here. Duplicating most of a table into an index is always going to be a loser in the end because of the redundant storage. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add primary key/unique constraint using prefix columns of an index
On 22 May 2012 18:24, Jeff Janes wrote: > Now that there are index only scans, there is a use case for having a > composite index which has the primary key or a unique key as the > prefix column(s) but with extra columns after that. Currently you > would also need another index with exactly the primary/unique key, > which seems like a waste of storage and maintenance. > > Should there be a way to declare a "unique" index with the unique > property applying to a prefix of the indexed columns/expression? And > having that, a way to turn that prefix into a primary key constraint? > > Of course this is easier said than done, but is there some reason for > it not to be a to-do item? +1 Very useful -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> If I have a customer with 1 database per user, how do they run a query > against 100 user tables? It would require 100 connections to the > database. Doing that would require roughly x100 the planning time and > x100 the connection delay. Larger SQL statements pass their results > between executor steps using libpq rather than direct calls. Why is this hypothetical customer using separate databases? This really seems like a case of "doctor, it hurts when I do this". -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 18:18, Josh Berkus wrote: > >> 1. Ability to have a Role that can only access one Database >> >> 2. Allow user info to be dumped with a database, to make a db >> completely self-consistent >> >> 3. Allow databases to be transportable >> >> 4. Allow users to access tables in >1 database easily, with appropriate >> rights. > > The last requirement seems completely contradictory to the other three. > Either we're trying to make databases even more isolated as > multi-tenant Catalogs, or we're not. Trying to do both at the same time > is failure-by-design. Why is it OK to allow somebody to access multiple schemas in one query, but not multiple databases? Are you arguing that schemas are also broken? I see no failure by design. I see an idea for greater ease of use being discussed. > Given that we offer schema as an alternative to multiple databases, and > users are starting to get used to them, I think that requirement (4) is > just a bad idea, and not worth pursuing, Personally, I have long recommended that people use schemas. But people do use databases and when they do they are pretty much screwed. I brought this up as a way of improving our ease of use. > except in the context of pgsql_fdw. That is not a realistic option. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Stephen, > Which is "preferred" > when you do a 'grant select' or 'grant role'? The local role is preferred, the same way we allow objects in the local schema to overshadow objects in the global schema. > Or do we just disallow > overlaps between per-DB roles and global roles? If we don't allow > duplicates, I suspect a lot of the other questions suddenly become a lot > easier to deal with, but would that be too much of a restriction? The feature wouldn't be useful if we didn't allow conflicts between two local role names. However, we could prohibit conflicts between a local role name and a global role name if it made the feature considerably easier. Users would find workarounds which weren't too arduous. > How > would you handle migrating an existing global role to a per-database > role? Again, I think it would be OK not handling it. i.e., the user needs to do the following: 1. create a new local role 2. reassign all the objects belonging to the global role to the local role 3. drop the global role 4. rename the local role It'd be somewhat of a PITA, but I suspect that most people using the "local roles" feature would recreate their databases from scratch anyway. And we could offer some sample scripts for the above on the wiki and elsewhere. Obviously, a more elegant migration command would be ideal, but that could wait for the following PG release; we usually follow the "make things possible first, and easy later" plan anyway. Given that I'd love to have this feature, I'm trying to pare down its requirements to a manageable size. Trying to do everything at once will only result in the feature stalling until 10.5. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
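A rough sketch of those four steps in SQL, using the hypothetical local-role syntax floated elsewhere in this thread; CREATE LOCAL ROLE does not exist today and the role names are made up, but REASSIGN OWNED, DROP ROLE, and ALTER ROLE ... RENAME are existing commands:

    -- 1. create a new local role (hypothetical syntax)
    CREATE LOCAL ROLE app_owner_local LOGIN;
    -- 2. reassign all objects owned by the global role
    REASSIGN OWNED BY app_owner TO app_owner_local;
    -- 3. drop the global role
    DROP ROLE app_owner;
    -- 4. rename the local role to take over the old name
    ALTER ROLE app_owner_local RENAME TO app_owner;

A real sample script would also have to deal with the fact that DROP ROLE complains if the global role still owns objects or holds privileges in other databases, which is part of what makes a more elegant migration command attractive.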
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 12:39 PM, Tom Lane wrote: > Robert Haas writes: >> If we're going to throw our current algorithm over wholesale, I'd >> rather use some approach that has been demonstrated to work well in >> other systems. Buffer eviction is a problem that's been around since >> the 1970s, and our algorithm is just about that old. > > Um, if you're claiming that that code dates from Berkeley days, you're > quite mistaken. Not the code: the algorithm. I believe that what we have implemented is GCLOCK, which is very old. In the interest of full disclosure, descriptions of exactly what GCLOCK is seem to vary a bit from paper to paper, but at least some papers describe the exact algorithm we use. > We adopted the clock sweep in 2005, after trying a few > other things whose performance was worse. I've not seen any argument in > this thread that suggests we should abandon clock sweep + usage counts > entirely. Rather, to me the issue is that we haven't completely gotten > rid of the last vestiges of the old global freelist approach. Well, I think that switching from one clock sweep to a clock sweep per backend would be basically an abandonment of the current approach. The results might be better or worse, but they'd surely be different. > BTW, it might be worth pointing out something I was trying to explain > to Amit at PGCon: the key reason that we went with usage counters rather > than something like a global LRU chain is that in the fast path where > ReadBuffer finds the requested block already in a buffer, it does not > have to contend for any global data structure to update the buffer's > usage information. It just has to bump the usage count in the buffer's > header. The free list, and the contention for BufFreelistLock, only > matter in the slow path where you're going to have to incur I/O anyway > (or at least a visit to the kernel). That seemed plenty good enough > in 2005. Our ambitions have now advanced further, so I'm on board with > trying to reduce contention here too, but I think it would be a mistake > to make the fast case any slower. Totally agreed. We're not the first people to think of this, either: CLOCK and GCLOCK have been extensively studied and found to be almost as good as LRU in selecting good victim pages, but with less contention. That's why people are using them. Here's a paper that defines GCLOCK to be the algorithm we use (page 2, second column, second paragraph from the bottom), and furthermore mentions PostgreSQL (top of page 3): http://staff.aist.go.jp/m.yui/publications/ICDE10_conf_full_409.pdf -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 14:05, Robert Haas wrote: > On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan wrote: >> That seems to be leaving aside the fact that we don't currently have any >> notion of how to allow FDWs to write the foreign tables. >> >> What is more, isn't the postgres FDW about talking to any postgres source? >> If so, does it have special smarts for when we are talking to ourselves? And >> if it doesn't then it seems unlikely to be an acceptable substitute for >> allowing talking direct to a sibling database. >> >> I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would >> adequately meet the case if we wanted to go that way. > > Well, I don't think anyone is claiming that FDWs as they exist today > solve all of the problems in this area. But I think that improving > FDWs is a more promising line of attack than trying to make backends > talk to multiple databases. Doing the latter will require massive > surgery on the relcache, the catcache, most of the secondary catalog > caches, the ProcArray, and every portion of the backend that thinks an > OID uniquely identifies an SQL object. Basically, they'd all need > database OID as an additional key field, which is undesirable for > performance reasons even if there were no issue of code churn. Ack, apart from the bit about OIDs no longer being unique. That might be an upgrade issue but it's obviously something we wouldn't allow if we did that. I'm not sure I see changing the caches as being massive surgery. Perhaps we could just bypass them altogether. > So I'd rather see us put the effort into pgsql_fdw, which, as Florian > says, will also let us talk to a completely separate server. If > you've got multiple databases in the same cluster and really need to > be doing queries across all of them, that's what schemas are > supposedly for. Now I know that people feel that doesn't work as well > as it needs to, but there again I think it would be easier to fix > schemas than to make cross-database queries work. We're a very long way from making that work well. IMHO easily much further than direct access. If I have a customer with 1 database per user, how do they run a query against 100 user tables? It would require 100 connections to the database. Doing that would require roughly x100 the planning time and x100 the connection delay. Larger SQL statements pass their results between executor steps using libpq rather than direct calls. I find it hard to believe FDWs will ever work sufficiently well to fix those problems. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Add primary key/unique constraint using prefix columns of an index
Now that there are index only scans, there is a use case for having a composite index which has the primary key or a unique key as the prefix column(s) but with extra columns after that. Currently you would also need another index with exactly the primary/unique key, which seems like a waste of storage and maintenance. Should there be a way to declare a "unique" index with the unique property applying to a prefix of the indexed columns/expression? And having that, a way to turn that prefix into a primary key constraint? Of course this is easier said than done, but is there some reason for it not to be a to-do item? Thanks, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
> 1. Ability to have a Role that can only access one Database > > 2. Allow user info to be dumped with a database, to make a db > completely self-consistent > > 3. Allow databases to be transportable > > 4. Allow users to access tables in >1 database easily, with appropriate > rights. The last requirement seems completely contradictory to the other three. Either we're trying to make databases even more isolated as multi-tenant Catalogs, or we're not. Trying to do both at the same time is failure-by-design. Given that we offer schema as an alternative to multiple databases, and users are starting to get used to them, I think that requirement (4) is just a bad idea, and not worth pursuing, except in the context of pgsql_fdw. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How could we make it simple to access the log as a table?
On Mon, May 21, 2012 at 05:53:55PM -0700, Josh Berkus wrote: > Hackers, > > There's a lot of great information in the postgres logs. While we > eventually want to get more sophisticated about providing users with > status and history information, for 9.3 it would be really nice to > just offer the current logs in system view format. > > Certainly this can be done ad-hoc using CSV format and csv_fdw. > However, such setups are fragile due to log rotation and other > issues. It seems like we could come up with a better way. Ideas? Would it help to automate this stuff with that CSV format? The trick, as I see it, is to make those on-disk log files be partitions of a table. Stephen Frost's point is well-taken, but I'm not sure we need to make it a blocker for this. Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
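For reference, the ad-hoc CSV setup being alluded to looks roughly like the file_fdw example in the documentation; the server name, table name, and filename below are assumptions, and the hard-coded filename is precisely the log-rotation fragility being discussed:

    CREATE EXTENSION file_fdw;
    CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE pglog (
      log_time timestamp(3) with time zone,
      user_name text,
      database_name text,
      process_id integer,
      connection_from text,
      session_id text,
      session_line_num bigint,
      command_tag text,
      session_start_time timestamp with time zone,
      virtual_transaction_id text,
      transaction_id bigint,
      error_severity text,
      sql_state_code text,
      message text,
      detail text,
      hint text,
      internal_query text,
      internal_query_pos integer,
      context text,
      query text,
      query_pos integer,
      location text,
      application_name text
    ) SERVER pglog
    OPTIONS ( filename 'pg_log/postgresql.csv', format 'csv' );

Making the rotating set of such files behave as partitions of one table, as suggested above, is the part that would need new machinery.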
Re: [HACKERS] heap metapages
On 22 May 2012 13:52, Robert Haas wrote: > It seems pretty clear to me that making pg_upgrade responsible for > emptying block zero is a non-starter. But I don't think that's a > reason to throw out the design; I think it's a problem we can work > around. I like your design better as well *if* you can explain how we can get to it. My proposal was a practical alternative that would allow the idea to proceed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exclusion Constraints on Arrays?
Robert Haas writes: > On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler > wrote: >> On May 13, 2012, at 3:45 PM, Robert Haas wrote: >>> It seems like maybe we could work around this by remembering the >>> contents of the pending list throughout the scan. Every time we hit a >>> TID while scanning the main index, we check whether we already >>> returned it from the pending list; if so, we skip it, but if not, we >>> return it. >> Should this go onto the To-Do list, then? > If someone other than me can confirm that it's not a stupid approach, > I would say yes. It seems probably workable given that we expect the pending list to be of fairly constrained size. However, the commit message referenced upthread also muttered darkly about GIN's partial match logic not working in amgettuple. I do not recall the details of that issue, but unless we can solve that one too, there's not much use in fixing this one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exclusion Constraints on Arrays?
On Tue, May 22, 2012 at 12:28 PM, David E. Wheeler wrote: > On May 13, 2012, at 3:45 PM, Robert Haas wrote: > >> It seems like maybe we could work around this by remembering the >> contents of the pending list throughout the scan. Every time we hit a >> TID while scanning the main index, we check whether we already >> returned it from the pending list; if so, we skip it, but if not, we >> return it. > > Should this go onto the To-Do list, then? If someone other than me can confirm that it's not a stupid approach, I would say yes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > If we're going to throw our current algorithm over wholesale, I'd > rather use some approach that has been demonstrated to work well in > other systems. Buffer eviction is a problem that's been around since > the 1970s, and our algorithm is just about that old. Um, if you're claiming that that code dates from Berkeley days, you're quite mistaken. We adopted the clock sweep in 2005, after trying a few other things whose performance was worse. I've not seen any argument in this thread that suggests we should abandon clock sweep + usage counts entirely. Rather, to me the issue is that we haven't completely gotten rid of the last vestiges of the old global freelist approach. BTW, it might be worth pointing out something I was trying to explain to Amit at PGCon: the key reason that we went with usage counters rather than something like a global LRU chain is that in the fast path where ReadBuffer finds the requested block already in a buffer, it does not have to contend for any global data structure to update the buffer's usage information. It just has to bump the usage count in the buffer's header. The free list, and the contention for BufFreelistLock, only matter in the slow path where you're going to have to incur I/O anyway (or at least a visit to the kernel). That seemed plenty good enough in 2005. Our ambitions have now advanced further, so I'm on board with trying to reduce contention here too, but I think it would be a mistake to make the fast case any slower. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is indexonlyscan so darned slow?
On Mon, May 21, 2012 at 2:29 PM, Merlin Moncure wrote: > On Mon, May 21, 2012 at 4:17 PM, Jeff Janes wrote: >> >> For vaguely real life, take your example of pgbench -i -s200 -F 50, >> and I have 2Gig RAM, which seems to be the same as you do. >> >> With select only work load (pgbench -S -M prepared -T 30), I get >> >> tps = 193 >> >> But now enable index-only scans: >> >> psql -c "create index on pgbench_accounts(aid, abalance);" >> >> and it goes up to. >> >> tps = 10137 > > Right -- the main driver here is that your index fits neatly in ram > and the heap does not -- so you're effectively measuring the > difference between a buffered and non-buffered access. That's highly > contrived as you noted and unlikely to come up all *that* often in the > real world. I don't think this one is highly contrived. With the index being 10 fold smaller than the table, there is plenty of window for one to fit in RAM and the other one not to in a variety of real world situations. (Although I only get 10 fold window because of -F 50. I don't know why Josh had that big of a difference in his original, unless he also used a nondefault fill factor setting. But of course many real world tables will be wider than pgbench_accounts is.) The highly contrived example useful for dissecting the implementation would be to do: set enable_seqscan=off; set enable_indexonlyscan=off; select count(*) from pgbench_accounts where aid is not null; The key that I keep forgetting is the "where aid is not null". Without that it uses the full table scan, even with enable_seqscan off, rather than doing an ordinary index scan. > Generally though the real world wins (although the gains will be > generally less spectacular) are heavily i/o bound queries where the > indexed subset of data you want is nicely packed and the (non > clustered) heap records are all over the place. By skipping the semi > random heap lookups you can see enormous speedups. I figure 50-90% > improvement would be the norm there, but this is against queries that > are taking forever, being i/o bound. > > See here: > http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm > for a 'in the wild' gripe about not having index scans. But without scripts to recreate the data with the right selectivities and correlations, and to generate a long stream of appropriate query parameterizations so that they don't become cached, that is just a gripe and not an example. I tried to reproduce the problem as stated, and couldn't make IOS be useful because I couldn't make it be slow even without them. Presumably I'm doing something wrong, but how could I tell what? Have we heard back on whether IOS was tried and proved useful to the originator of that thread? If you want an example where neither index nor table fit in memory, then just bump up the scale to 2000---and get a machine with only 2G of memory if you don't already have one :) With the extra index in place: with enable_indexonlyscan=off tps = 155.1 with enable_indexonlyscan=on tps = 453.7 It seems like that should have only doubled, I'm not sure why it did more than double. Maybe the index became better cached when the table stopped competing with it for buffers. If you leave the select-only world and go back to doing updates, then that extra index is going to hurt you somewhat, but if the dominant bottleneck is rpm of your WAL drive, it might not be noticeable.
Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
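For anyone wanting to reproduce the bigger-than-memory case described above, a rough sketch of the whole sequence; the scale, fillfactor, and tps figures come from this message, and PGOPTIONS is just one convenient way to flip the planner setting per run:

    pgbench -i -s 2000 -F 50 bench
    psql -d bench -c "create index on pgbench_accounts(aid, abalance);"
    # index-only scans disabled: ~155 tps reported above
    PGOPTIONS='-c enable_indexonlyscan=off' pgbench -S -M prepared -T 30 bench
    # index-only scans enabled (the default): ~454 tps reported above
    pgbench -S -M prepared -T 30 bench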
Re: [HACKERS] Exclusion Constraints on Arrays?
On May 13, 2012, at 3:45 PM, Robert Haas wrote: > It seems like maybe we could work around this by remembering the > contents of the pending list throughout the scan. Every time we hit a > TID while scanning the main index, we check whether we already > returned it from the pending list; if so, we skip it, but if not, we > return it. Should this go onto the To-Do list, then? Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 18:03 , Thom Brown wrote: > On 22 May 2012 16:57, Florian Pflug wrote: >> On May22, 2012, at 16:09 , Tom Lane wrote: >>> Thom Brown writes: >>>> Conflicts would occur where localrolename matches an existing local >>>> role name within the same database, or a global role name, but not a >>>> local role name within another database. The problem with this, >>>> however, is that creating global roles would need conflict checks >>>> against local roles in every database, unless a manifest of all local >>>> roles were registered globally. >>> >>> Yeah. The same type of issue arises for the roles' OIDs. You'd really >>> want local and global roles to have nonconflicting OIDs, else it's >>> necessary to carry around an indication of which type each role is; >>> which would be more or less a show-stopper in terms of the number of >>> catalogs and internal APIs affected. But I don't currently see any >>> nice way to guarantee that if each database has a private table of >>> local roles. >> >> Maybe we could simply make all global role's OIDs even, and all local ones >> odd, or something like that. > > Wouldn't that instantly make all previous versions of database > clusters un-upgradable? Only if pg_upgrade needs to preserve the OIDs of roles. I kinda hoped it wouldn't, because role OIDs aren't usually stored in non-system tables. Hm… thinking about this further, it'd actually be sufficient for all newly allocated role OIDs to follow the odd/even rule, if we additionally check for conflicts with existing global role OIDs when allocating the OID of a new local role. Which is much, much easier than checking for conflicts when allocating a global OID, because for that you'd have to check against the local role OIDs within *all* databases, not just against one shared table. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 12:11 PM, Tom Lane wrote: >> With respect to the control logic for the background writer, one idea >> I had was to get rid of the idea that the background writer's job is >> to write in advance of the strategy point. Instead, every time the >> clock sweep passes over a dirty buffer that is otherwise evictable, we >> add it to a queue of things that the bgwriter should clean. Those >> buffers, once cleaned, go on the free list. Maybe some variant of >> that could work with your idea. > > Both ends of that imply centralized data structures that could become > contention bottlenecks, so it doesn't sound tremendously appealing to > me. Maybe it can work as long as nobody has to lock the lists for long, > but I'd rather think of approaches with no central point of contention. If we're going to throw our current algorithm over wholesale, I'd rather use some approach that has been demonstrated to work well in other systems. Buffer eviction is a problem that's been around since the 1970s, and our algorithm is just about that old. I realize that there are (legitimate) concerns about what might be patented, but hopefully that doesn't mean we're not allowed to consult the literature in any way. If that were the case, we wouldn't have SSI. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Thom Brown writes: > On 22 May 2012 16:57, Florian Pflug wrote: >> Maybe we could simply make all global role's OIDs even, and all local ones >> odd, or something like that. > Wouldn't that instantly make all previous versions of database > clusters un-upgradable? IIRC, pg_upgrade doesn't need to force role OIDs to be the same in the new cluster, so we could get away with this trick for the specific case of roles. It wouldn't work for introducing local/global versions of some other types of objects though. Another objection is that it wouldn't scale up nicely to multiple levels of catalog hierarchy. But maybe local/global is enough. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers
Dear Robert, Thank you very much for the reply. You mean that the primary which is going to switch its role to standby might not have sent all the WAL records to the standby, and if it is switched to standby it will have more WAL records than the standby which now serves as primary. Is that it? It is actually the standby server which has to be restored from the archive when it is switching to primary, right? Not the primary which is switching to standby? Regards, Reddy. -- View this message in context: http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637p5709495.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On May22, 2012, at 18:00 , Susanne Ebrecht wrote: > Am 22.05.2012 17:42, schrieb Tom Lane: >> Encoding yes, but since 9.1 we have pretty fine-grained control of >> collation. So I think this argument is a lot weaker than it used >> to be. It would only really apply if you have one of the corner >> cases where utf8 doesn't work for you. > > Yeah it got better - but it isn't perfect yet. Still, the whole reason that the encoding is a per-database property is that we don't tag each string with its encoding. To support cross-database queries, we'd have to do that, and could then just as well make the encoding a per-column property, or at least so I think. > Maybe I am blind or 9.1 documentation has a bug - but according to the > documentation you can't change default collation per schema or per table. > You can set collation per column - but do you really want to set collation for > every single column of every single supported language in your 200+ tables > web tool? > > That is a huge effort and a huge maintenance effort. You could always write a pl/pgsql function which iterates over all columns of type text or varchar within a schema and sets the desired collation, but > I would like to get default collation per schema / table in 9.2 or 9.3 but > that is my personal wish, yeah, that'd definitely be nice. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
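For illustration, a minimal sketch of such a function; the function name and the example collation are made up, and note that each ALTER TABLE below rewrites the table, so this is only sensible as a one-off migration step:

    CREATE OR REPLACE FUNCTION set_schema_collation(p_schema text, p_collation text)
    RETURNS void LANGUAGE plpgsql AS $$
    DECLARE
        col record;
    BEGIN
        FOR col IN
            SELECT c.relname, a.attname,
                   format_type(a.atttypid, a.atttypmod) AS coltype
            FROM pg_class c
            JOIN pg_namespace n ON n.oid = c.relnamespace
            JOIN pg_attribute a ON a.attrelid = c.oid
            WHERE n.nspname = p_schema
              AND c.relkind = 'r'
              AND a.attnum > 0
              AND NOT a.attisdropped
              AND a.atttypid IN ('text'::regtype, 'varchar'::regtype)
        LOOP
            EXECUTE format('ALTER TABLE %I.%I ALTER COLUMN %I TYPE %s COLLATE %I',
                           p_schema, col.relname, col.attname,
                           col.coltype, p_collation);
        END LOOP;
    END;
    $$;

    -- e.g. SELECT set_schema_collation('russian', 'ru_RU.utf8');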
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > On Tue, May 22, 2012 at 10:25 AM, Tom Lane wrote: >> My own thoughts about this had pointed in the direction of getting rid >> of the central freelist entirely, instead letting each backend run its >> own independent clock sweep as needed. > Hmm, that's certainly an interesting idea. I fear that if the clock > sweeps from the different backends ended up too closely synchronized, > you would end up evicting whatever was in the way, be it hot or cold. > It might almost be better to have individual backends choose buffers > to evict at random; if the chosen buffer isn't evictable, we decrement > its usage count and pick another one, also at random. Hmm ... yeah, in principle that could be better. It's still a clock sweep but following a hard-to-predict ordering of the buffers. Being hard to predict doesn't necessarily guarantee no bad behavior though. Maybe we could do something a bit cheaper than random() and more amenable to analysis, that would still ensure that backends couldn't end up with closely sync'd scans. I'm imagining advancing not 1 buffer each time, but K buffers where each backend will use a different value of K. Perhaps backend with backendID N could use the N'th prime number, say. You'd want something relatively prime to shared_buffers to guarantee that the backend can visit all the buffers, and just making it prime would do fine. > With respect to the control logic for the background writer, one idea > I had was to get rid of the idea that the background writer's job is > to write in advance of the strategy point. Instead, every time the > clock sweep passes over a dirty buffer that is otherwise evictable, we > add it to a queue of things that the bgwriter should clean. Those > buffers, once cleaned, go on the free list. Maybe some variant of > that could work with your idea. Both ends of that imply centralized data structures that could become contention bottlenecks, so it doesn't sound tremendously appealing to me. Maybe it can work as long as nobody has to lock the lists for long, but I'd rather think of approaches with no central point of contention. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 12:00 PM, Susanne Ebrecht wrote: > Usually you want to set the collation once per language schema. E.g. schema > russian gets Russian collation and schema British gets British collation and > so on. > > CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and > do it by creating a database. > > I would like to get default collation per schema / table in 9.2 or 9.3 but > that is my personal > wish, Interesting idea. Sort of like ALTER DEFAULT PRIVILEGES. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On 22 May 2012 16:57, Florian Pflug wrote: > On May22, 2012, at 16:09 , Tom Lane wrote: >> Thom Brown writes: >>> Conflicts would occur where localrolename matches an existing local >>> role name within the same database, or a global role name, but not a >>> local role name within another database. The problem with this, >>> however, is that creating global roles would need conflict checks >>> against local roles in every database, unless a manifest of all local >>> roles were registered globally. >> >> Yeah. The same type of issue arises for the roles' OIDs. You'd really >> want local and global roles to have nonconflicting OIDs, else it's >> necessary to carry around an indication of which type each role is; >> which would be more or less a show-stopper in terms of the number of >> catalogs and internal APIs affected. But I don't currently see any >> nice way to guarantee that if each database has a private table of >> local roles. > > Maybe we could simply make all global role's OIDs even, and all local ones > odd, or something like that. Wouldn't that instantly make all previous versions of database clusters un-upgradable? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Am 22.05.2012 17:42, schrieb Tom Lane: Encoding yes, but since 9.1 we have pretty fine-grained control of collation. So I think this argument is a lot weaker than it used to be. It would only really apply if you have one of the corner cases where utf8 doesn't work for you. Yeah it got better - but it isn't perfect yet. Maybe I am blind or 9.1 documentation has a bug - but according to the documentation you can't change default collation per schema or per table. You can set collation per column - but do you really want to set collation for every single column of every single supported language in your 200+ tables web tool? That is a huge effort and a huge maintenance effort. Usually you want to set the collation once per language schema. E.g. schema russian gets Russian collation and schema British gets British collation and so on. CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and do it by creating a database. I would like to get default collation per schema / table in 9.2 or 9.3 but that is my personal wish, Susanne -- Dipl. Inf. Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 10:25 AM, Tom Lane wrote: >> The idea would be to have a background process (like bgwriter) >> maintain the global LRU state and push candidate buffers onto the >> freelist. > > Amit was trying to convince me of the same idea at PGCon, but I don't > buy it. bgwriter doesn't scan the buffer array nearly fast enough to > provide useful adjustment of the usage counts under load. And besides > if the decrements are decoupled from the allocation requests it's no > longer obvious that the algorithm is even an approximation of LRU. Well, bgwriter is *supposed* to anticipate which buffers are about to be allocated and clean any of those that are dirty. Having it decrement the usage counts and stuff the resulting list of buffers into a linked list seems like a pretty reasonable extension of that, assuming that it works in the first place. If it doesn't, then we need a rethink. > But the larger issue here is that if that processing is a bottleneck > (which I agree it is), how does it help to force a single process to > be responsible for it? Any real improvement in scalability here will > need to decentralize the operation more, not less. Sure. I think we could have the freelist and the clock sweep protected by different locks. The background writer would lock out other people running the clock sweep, but the freelist could be protected by a spinlock which no one would ever need to take for more than a few cycles. Right there, you should get a significant scalability improvement, since the critical section would be so much shorter than it is now. If that's not enough, you could have several freelists protected by different spinlocks; the bgwriter would put 1/Nth of the reusable buffers on each freelist, and backends would pick a freelist at random to pull buffers off of. > My own thoughts about this had pointed in the direction of getting rid > of the central freelist entirely, instead letting each backend run its > own independent clock sweep as needed. The main problem with that is > that if there's no longer any globally-visible clock sweep state, it's > pretty hard to figure out what the control logic for the bgwriter should > look like. Maybe it would be all right to have global variables that > are just statistics counters for allocations and buffers swept over, > which backends would need to spinlock for just long enough to increment > the counters at the end of each buffer allocation. Hmm, that's certainly an interesting idea. I fear that if the clock sweeps from the different backends ended up too closely synchronized, you would end up evicting whatever was in the way, be it hot or cold. It might almost be better to have individual backends choose buffers to evict at random; if the chosen buffer isn't evictable, we decrement its usage count and pick another one, also at random. With respect to the control logic for the background writer, one idea I had was to get rid of the idea that the background writer's job is to write in advance of the strategy point. Instead, every time the clock sweep passes over a dirty buffer that is otherwise evictable, we add it to a queue of things that the bgwriter should clean. Those buffers, once cleaned, go on the free list. Maybe some variant of that could work with your idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On May22, 2012, at 16:09 , Tom Lane wrote: > Thom Brown writes: >> Conflicts would occur where localrolename matches an existing local >> role name within the same database, or a global role name, but not a >> local role name within another database. The problem with this, >> however, is that creating global roles would need conflict checks >> against local roles in every database, unless a manifest of all local >> roles were registered globally. > > Yeah. The same type of issue arises for the roles' OIDs. You'd really > want local and global roles to have nonconflicting OIDs, else it's > necessary to carry around an indication of which type each role is; > which would be more or less a show-stopper in terms of the number of > catalogs and internal APIs affected. But I don't currently see any > nice way to guarantee that if each database has a private table of > local roles. Maybe we could simply make all global role's OIDs even, and all local ones odd, or something like that. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Susanne Ebrecht writes: > The use case in my mind for accessing more databases is when you want to > access stuff different languages. > You only can set encoding / LC_Collate per database not per schema. > So for different languages you might need different databases to do > correct sorting / indexing. Encoding yes, but since 9.1 we have pretty fine-grained control of collation. So I think this argument is a lot weaker than it used to be. It would only really apply if you have one of the corner cases where utf8 doesn't work for you. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Am 22.05.2012 15:27, schrieb Albe Laurenz: If you need different applications to routinely access each other's tables, why not assign them to different schemas in one database? The use case in my mind for accessing more databases is when you want to access stuff in different languages. You can only set encoding / LC_Collate per database, not per schema. So for different languages you might need different databases to do correct sorting / indexing. Susanne -- Dipl. Inf. Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] weird error message in sepgsql
On Sun, May 20, 2012 at 11:28 AM, Tom Lane wrote: > Peter Eisentraut writes: >> I found this in contrib/sepgsql/expected/label.out: >> SECURITY LABEL ON COLUMN t2 >> IS 'system_u:object_r:sepgsql_ro_table_t:s0'; -- be failed >> ERROR: improper relation name (too many dotted names): > >> Contrast with: > >> SECURITY LABEL ON COLUMN t2.b >> IS 'system_u:object_r:sepgsql_ro_table_t:s0'; -- ok > >> I guess what's happening is that it's calling makeRangeVarFromNameList() >> with a list of length zero. > >> We should either fix the SECURITY LABEL command to catch that case >> beforehand, or fix makeRangeVarFromNameList() to give a proper error >> message, or both. > > I think the appropriate error message is probably along the lines of > "column name must be qualified", and it's hard to justify having > makeRangeVarFromNameList emit such a thing. So this is the fault > of the calling code. Fixed and back-patched to 9.1. Thanks for the report. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > On Tue, May 22, 2012 at 10:01 AM, Tom Lane wrote: >> Well, keep in mind that that action is not merely there to obtain a >> victim buffer; it is also maintaining the global LRU state (by >> decrementing the usage counts of buffers it passes over). I don't think >> you can change it to simply look only at a predetermined freelist >> without seriously compromising the overall quality of our buffer >> replacement decisions. > The idea would be to have a background process (like bgwriter) > maintain the global LRU state and push candidate buffers onto the > freelist. Amit was trying to convince me of the same idea at PGCon, but I don't buy it. bgwriter doesn't scan the buffer array nearly fast enough to provide useful adjustment of the usage counts under load. And besides if the decrements are decoupled from the allocation requests it's no longer obvious that the algorithm is even an approximation of LRU. But the larger issue here is that if that processing is a bottleneck (which I agree it is), how does it help to force a single process to be responsible for it? Any real improvement in scalability here will need to decentralize the operation more, not less. My own thoughts about this had pointed in the direction of getting rid of the central freelist entirely, instead letting each backend run its own independent clock sweep as needed. The main problem with that is that if there's no longer any globally-visible clock sweep state, it's pretty hard to figure out what the control logic for the bgwriter should look like. Maybe it would be all right to have global variables that are just statistics counters for allocations and buffers swept over, which backends would need to spinlock for just long enough to increment the counters at the end of each buffer allocation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes complete
On Mon, May 21, 2012 at 9:54 PM, Noah Misch wrote: > On Wed, May 09, 2012 at 11:11:02PM -0400, Bruce Momjian wrote: >> I have completed my draft of the 9.2 release notes, and committed it to >> git. > > Concerning "Have psql \copy use libpq's SendQuery()", SendQuery() is a > psql-internal interface, not a libpq interface. > > The array statistics patch added new columns to the pg_stats view, and it > moved existing tsvector most-common-element statistics to those new columns. > Let's mention that as a (minor) incompatibility. > > Proposed changes attached. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
* Thom Brown (t...@linux.com) wrote: > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. Hmm, right, that's a bit of a sticky point. wrt your suggestion - it works great if we don't allow duplicates, but then people have to accept their role name is getting changed. That said, perhaps that's not that big of a deal, since you could rename it afterwards. The issue with the constraints on other databases might actually be cause enough to allow duplicates, just to avoid that issue. We could mirror the per-database roles into a shared space that isn't externally visible, but at that point, maybe we should try to get RLS for the catalogs instead, or just modify the views to only show roles which can connect to this database. That's not going to make them completely transparent, but it might be enough for some use cases. Thanks, Stephen
Re: [HACKERS] Per-Database Roles
On Tue, May 22, 2012 at 9:37 AM, Thom Brown wrote: > On 22 May 2012 14:04, Stephen Frost wrote: >> What would the semantics of that look like though? Which is "preferred" >> when you do a 'grant select' or 'grant role'? Or do we just disallow >> overlaps between per-DB roles and global roles? If we don't allow >> duplicates, I suspect a lot of the other questions suddenly become a lot >> easier to deal with, but would that be too much of a restriction? How >> would you handle migrating an existing global role to a per-database >> role? > > Perhaps: > > CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ] > option [ ... ] ] > > Then: > > CREATE LOCAL ROLE localrolename LIKE globalrolename; > > REASSIGN OWNED BY globalrolename TO localrolename; > > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. There are race conditions to worry about, too. In most cases, we rely on the btree index machinery as a final backstop against duplicate catalog entries. But that doesn't work unless everything's in one catalog, nor for anything more complicated than "this set of columns taken together should be unique over every role". Even if we were OK with incurring the ugliness of storing per-database roles in a shared catalog, the uniqueness constraint you'd want is something like "no two roles can share the same name unless they have unequal database OIDs neither of which is zero", which I don't believe we can enforce via the btree machinery, at least not without an expression index that won't work in a system catalog anyway. In retrospect, I think the idea of shared catalogs was probably a bad idea. I think we should have made roles and tablespaces database objects rather than shared objects, and come up with some ad-hoc method of representing the set of available databases. But that decision seems to have been made sometime pre-1996, so the thought of changing it now is pretty painful, but I can dream... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
Thom Brown writes: > Conflicts would occur where localrolename matches an existing local > role name within the same database, or a global role name, but not a > local role name within another database. The problem with this, > however, is that creating global roles would need conflict checks > against local roles in every database, unless a manifest of all local > roles were registered globally. Yeah. The same type of issue arises for the roles' OIDs. You'd really want local and global roles to have nonconflicting OIDs, else it's necessary to carry around an indication of which type each role is; which would be more or less a show-stopper in terms of the number of catalogs and internal APIs affected. But I don't currently see any nice way to guarantee that if each database has a private table of local roles. You could possibly make it work if all roles, local and global, are stored in a single shared catalog. But that seems pretty ugly. BTW, I wonder whether this type of problem isn't also pretty fatal for the sorts of hierarchical catalog structures we were speculating about at PGCon. When we were talking about that I was sort of assuming that the more-closely-nested levels could just hide conflicting objects at outer levels, but on reflection that seems a bit scary. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 22, 2012 at 10:01 AM, Tom Lane wrote: > Robert Haas writes: >> Mind you, I think this whole area of the code needs some reengineering >> for better performance, but I'm not sure this is the right place to >> start. What I think is really bad is that we're forcing every >> BufferAlloc() to iterate over buffers checking whether each one is >> evictable. > > Well, keep in mind that that action is not merely there to obtain a > victim buffer; it is also maintaining the global LRU state (by > decrementing the usage counts of buffers it passes over). I don't think > you can change it to simply look only at a predetermined freelist > without seriously compromising the overall quality of our buffer > replacement decisions. The idea would be to have a background process (like bgwriter) maintain the global LRU state and push candidate buffers onto the freelist. Then foreground processes can just pop them off the list and recheck that they haven't been pinned meanwhile. As long as we don't let the background sweep get too far ahead of actual allocation needs, I don't think this would change the quality of buffer allocation much at all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog
On Mon, May 14, 2012 at 2:24 PM, Fujii Masao wrote: > On Fri, May 11, 2012 at 11:43 PM, Magnus Hagander wrote: >> Should we go down the easy way and just reject connections when the flag is >> mismatching between the client and the server (trivial to do - see the >> attached patch)? > > + char *tmpparam; > > You forgot to add "const" before "char", which causes a compile-time warning. I went ahead and committed this, with this fix and a slight change to the message text. Hope that's OK with everyone... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
Robert Haas writes: > Mind you, I think this whole area of the code needs some reengineering > for better performance, but I'm not sure this is the right place to > start. What I think is really bad is that we're forcing every > BufferAlloc() to iterate over buffers checking whether each one is > evictable. Well, keep in mind that that action is not merely there to obtain a victim buffer; it is also maintaining the global LRU state (by decrementing the usage counts of buffers it passes over). I don't think you can change it to simply look only at a predetermined freelist without seriously compromising the overall quality of our buffer replacement decisions. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "could not open relation with OID" errors after promoting the standby to master
On Tue, May 15, 2012 at 10:37 PM, Joachim Wieland wrote: > If it matters, I have not promoted the master with a trigger file but > restarted it after deleting recovery.conf. Hmm. I think that if you do it this way, the minimum recovery point won't be respected, which could leave you with a corrupted database. Now, if all the WAL files that you need are present in pg_xlog, then they ought to get replayed anyway, but I think that if you are using restore_command (as opposed to streaming replication) we restore WAL segments under a different file name, which might cause this problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] enhanced error fields
On Wed, May 9, 2012 at 9:33 AM, Pavel Stehule wrote: > here is patch with enhancing ErrorData structure. Now constraints > errors and RI uses these fields Please add this to https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Readme of Buffer Management seems to have wrong sentence
On Tue, May 8, 2012 at 9:37 PM, Amit Kapila wrote: > I have checked the code and logic according to which usage counter is > increased when the buffer is pinned. Fixed, thanks for the report. > Another Doubt : Why in function BufferAlloc, it needs to hold the > BufFreelistLock till it pin the buffer which increases its reference count. Well, I think the problem is that, if we didn't do that, then, in theory, the strategy point could wrap all the way around shared_buffers and someone else could pin the buffer, and then we'd be hosed. Mind you, I think this whole area of the code needs some reengineering for better performance, but I'm not sure this is the right place to start. What I think is really bad is that we're forcing every BufferAlloc() to iterate over buffers checking whether each one is evictable. I think we ought to put only those buffers that we think are likely to be evictable on the freelist, and then the actual buffer eviction code would only need to recheck that nothing had changed, instead of needing to scan over a potentially quite large number of buffers that never had any chance of being selected in the first place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Per-Database Roles
On 22 May 2012 14:04, Stephen Frost wrote: > What would the semantics of that look like though? Which is "preferred" > when you do a 'grant select' or 'grant role'? Or do we just disallow > overlaps between per-DB roles and global roles? If we don't allow > duplicates, I suspect a lot of the other questions suddenly become a lot > easier to deal with, but would that be too much of a restriction? How > would you handle migrating an existing global role to a per-database > role? Perhaps: CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ] option [ ... ] ] Then: CREATE LOCAL ROLE localrolename LIKE globalrolename; REASSIGN OWNED BY globalrolename TO localrolename; Conflicts would occur where localrolename matches an existing local role name within the same database, or a global role name, but not a local role name within another database. The problem with this, however, is that creating global roles would need conflict checks against local roles in every database, unless a manifest of all local roles were registered globally. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
Simon Riggs wrote: > On 21 May 2012 20:40, Stephen Frost wrote: > >>> This is important. I like the idea of breaking down the barriers >>> between databases to allow it to be an option for one backend to >>> access tables in multiple databases. > So collecting a few requirements from various places: [...] > * Allow users to access tables in >1 database easily, with appropriate rights. > The main objectives are to make a Database a more easily used > administrative grouping. At present, people who use multiple Databases > face many problems - they aren't as separate as you'd like, but > neither can they be ignored when required. > > The idea of "one main database per session" is fine, but wiring it so > closely into the backend has a few disadvantages, many of them weird > internal things. > > Are there arguments against those requirements before we spend time on > design/thinking? From my perspective it is a great advantage that a user connected to one database cannot access objects from a different database without additional software, no matter what permissions he or she has (short of superuser, who can do anything). This enables us to have many different databases in one cluster without having to worry that they can affect each other. If you need different applications to routinely access each other's tables, why not assign them to different schemas in one database? For those cases where you absolutely need access to a different database, you can use dblink or a foreign data wrapper (hopefully in 9.3). So -1 on that particular suggestion. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
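For reference, a minimal sketch of the dblink route mentioned above (the database and table names are made up, and it assumes the dblink extension is installed):

-- Query a table in a sibling database "salesdb" from the current session.
CREATE EXTENSION IF NOT EXISTS dblink;

SELECT *
FROM dblink('dbname=salesdb',
            'SELECT id, total FROM orders')
       AS t(id integer, total numeric);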
Re: [HACKERS] ERROR: catalog is missing 2 attribute(s) for relid 16584
On Mon, May 14, 2012 at 1:01 AM, Prakash Itnal wrote: > Recently we faced an issue with postgres server where it is throwing error: > > ERROR: catalog is missing 2 attribute(s) for relid 16584 > CONTEXT: automatic analyze of table "DBRNW.public.act_wsta" > > I checked in the database and found that this table is not present but the > entry for the same is present in 'pg_tables'. This error is occurring when > auto analyze is executed. > > I searched for the issue in postgres blogs and found that there is no way to > recover from this situation except to restore from the last backup!! I wanted to > confirm: is there any way to recover at run time? Because if we restore from > the last backup then we might lose some data that got updated after this error > occurred. > > Any help would be appreciated. It sounds like you are missing a couple of pg_attribute entries. You might be able to figure out what's missing and put them back, but of course whatever got your system into this state might have also caused other corruption... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
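As a starting point for diagnosis, a query along these lines (run as superuser) compares the column count pg_class claims against the pg_attribute rows that actually exist; it only locates the damage, it does not repair it:

-- Sketch: relations whose recorded user-column count does not match
-- the pg_attribute rows present for them.
SELECT c.oid, c.relname, c.relnatts AS expected, count(a.attnum) AS found
FROM pg_class c
     LEFT JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum > 0
GROUP BY c.oid, c.relname, c.relnatts
HAVING c.relnatts <> count(a.attnum);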
Re: [HACKERS] hot standby PSQL 9.1 Windows 2008 Servers
On Mon, May 14, 2012 at 8:18 AM, chinnaobi wrote: > I do a base backup only the first time, when the standby is set up to be > replicated. Whenever the primary goes down, the standby becomes primary, and > the primary becomes standby when it comes back up. When the primary becomes standby, > I restore data from the WAL archive and start the postgres service with streaming > replication to connect to the primary. > > This setup is working. I don't think this is safe. The primary might have WAL that never made it to the standby, in which case the two machines will be out of sync with each other and all sorts of bad stuff could happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan wrote: > That seems to be leaving aside the fact that we don't currently have any > notion of how to allow FDWs to write the foreign tables. > > What is more, isn't the postgres FDW about talking to any postgres source? > If so, does it have special smarts for when we are talking to ourselves? And > if it doesn't then it seems unlikely to be an acceptable substitute for > allowing talking direct to a sibling database. > > I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would > adequately meet the case if we wanted to go that way. Well, I don't think anyone is claiming that FDWs as they exist today solve all of the problems in this area. But I think that improving FDWs is a more promising line of attack than trying to make backends talk to multiple databases. Doing the latter will require massive surgery on the relcache, the catcache, most of the secondary catalog caches, the ProcArray, and every portion of the backend that thinks an OID uniquely identifies an SQL object. Basically, they'd all need database OID as an additional key field, which is undesirable for performance reasons even if there were no issue of code churn. So I'd rather see us put the effort into pgsql_fdw, which, as Florian says, will also let us talk to a completely separate server. If you've got multiple databases in the same cluster and really need to be doing queries across all of them, that's what schemas are supposedly for. Now I know that people feel that doesn't work as well as it needs to, but there again I think it would be easier to fix schemas than to make cross-database queries work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
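As a rough illustration of the FDW route: the SQL/MED commands themselves (CREATE SERVER, CREATE USER MAPPING, CREATE FOREIGN TABLE) are already in core, while the wrapper name and its options below are assumptions, since pgsql_fdw itself is not yet there.

CREATE EXTENSION pgsql_fdw;          -- assumed extension name

CREATE SERVER otherdb
    FOREIGN DATA WRAPPER pgsql_fdw
    OPTIONS (host 'localhost', dbname 'salesdb');   -- assumed option names

CREATE USER MAPPING FOR CURRENT_USER
    SERVER otherdb
    OPTIONS (user 'app', password 'secret');

CREATE FOREIGN TABLE orders (id integer, total numeric) SERVER otherdb;

SELECT count(*) FROM orders;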
[HACKERS] Per-Database Roles
* Simon Riggs (si...@2ndquadrant.com) wrote: > * Ability to have a Role that can only access one Database Alright, I'd like to think about this one specifically and solicit feedback on the idea that we keep the existing shared role tables but add on additional tables for per-database roles. In the past, I feel like we've been focused on the idea of moving all roles to be per-database instead of per-cluster, which certainly has a lot of problems associated with it, but in the end, I think people would be really happy with some shared roles and some per-DB roles. What would the semantics of that look like though? Which is "preferred" when you do a 'grant select' or 'grant role'? Or do we just disallow overlaps between per-DB roles and global roles? If we don't allow duplicates, I suspect a lot of the other questions suddenly become a lot easier to deal with, but would that be too much of a restriction? How would you handle migrating an existing global role to a per-database role? Thanks, Stephen
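To make the ambiguity concrete: suppose duplicates were allowed and both a global role and a per-database role named "alice" existed in the current database (a purely hypothetical situation today); it is then unclear which one statements like these should target.

GRANT SELECT ON accounts TO alice;   -- the local alice or the global one?
GRANT alice TO bob;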
Re: [HACKERS] heap metapages
On Tue, May 22, 2012 at 4:52 AM, Simon Riggs wrote: > Based upon all you've said, I'd suggest that we make a new kind of > fork, in a separate file for this, .meta. But we also optimise the VM > and FSM in the way you suggest so that we can replace .vm and .fsm > with just .meta in most cases. Big tables would get a .vm and .fsm > appearing when they get big enough, but that won't challenge the inode > limits. When .vm and .fsm do appear, we remove that info from the > metapage - that means we keep all code as it is currently, except for > an optimisation of .vm and .fsm when those are small enough to do so. Well, let's see. That would mean that a small heap relation has 2 forks instead of 3, and a large relation has 4 forks instead of 3. In my proposal, a small relation has 1 fork instead of 3, and a large relation still has 3 forks. So I like mine better. Also, I think that we need a good chunk of the metadata here for both tables and indexes. For example, if we use the metapage to store information about whether a relation is logged, unlogged, being converted from logged to unlogged, or being converted from unlogged to logged, we need that information both for tables and for indexes. Now, there's no absolute reason why those cases have to be handled symmetrically, but I think things will be a lot simpler if they are. If we settle on the rule that block 0 of every relation contains a certain chunk of metadata at a certain byte offset, then the code to retrieve that data when needed is pretty darn simple. If tables put it in a separate fork and indexes put it in the main fork inside the metablock somewhere, then things are not so simple. And I sure don't want to add a separate fork for every index just to hold the metadata: that would be a huge hit in terms of total inode consumption. > We can watermark data files using special space on block zero using > some code to sneak that in when the page is next written, but that is > regarded as optional, rather than an essential aspect of an > upgrade/normal operation. > > Having pg_upgrade touch data files is both dangerous and difficult to > back out in case of mistake, so I am wary of putting the metapage at > block 0. Doing it the way I suggest means the .meta files would be > wholly new and can be deleted as a back-out. We can also clean away > any unnecessary .vm/.fsm files as a later step. It seems pretty clear to me that making pg_upgrade responsible for emptying block zero is a non-starter. But I don't think that's a reason to throw out the design; I think it's a problem we can work around. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 05/22/2012 07:56 AM, Robert Haas wrote: On Tue, May 22, 2012 at 7:35 AM, Florian Pflug wrote: * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. +1. That seems to be leaving aside the fact that we don't currently have any notion of how to allow FDWs to write the foreign tables. What is more, isn't the postgres FDW about talking to any postgres source? If so, does it have special smarts for when we are talking to ourselves? And if it doesn't, then it seems unlikely to be an acceptable substitute for talking directly to a sibling database. I'm not yet sold on Simon's plan at all, but I'm skeptical that an FDW would adequately meet the case if we wanted to go that way. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On Tue, May 22, 2012 at 1:50 AM, Tom Lane wrote: > Currently, the planner keeps paths that appear to win on the grounds of > either cheapest startup cost or cheapest total cost. It suddenly struck > me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor > fast-start preference, etc) we could know a-priori that cheapest startup > cost is not going to be interesting, and hence immediately discard any > path that doesn't win on total cost. > > This would require some additional logic to detect whether the case > applies, as well as extra complexity in add_path. So it's possible > that it wouldn't be worthwhile overall. Still, it seems like it might > be a useful idea to investigate. > > Thoughts? Yeah, I think we should investigate that. Presumably you could easily have a situation where one part of the tree is under a LIMIT or EXISTS and therefore needs to preserve fast-start plans but the rest of the (potentially large) tree isn't, so we need something fairly fine-grained, I think. Maybe we could add a flag to each RelOptInfo indicating whether fast-start plans should be kept, or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
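A hedged example of the mixed case (table names invented, plan shapes will vary): inside the EXISTS the executor stops after the first match, so a cheap-startup path is valuable there, while the outer aggregate scans everything and only total cost matters.

EXPLAIN
SELECT count(*)
FROM orders o
WHERE EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.id);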
Re: [HACKERS] Changing the concept of a DATABASE
On May 22, 2012, at 13:47 , Simon Riggs wrote: > On 22 May 2012 12:35, Florian Pflug wrote: >>> * Allow users to access tables in >1 database easily, with appropriate >>> rights. >> >> That one I'm very sceptical about. In the long run, I think we want better >> separation of databases, not less, and this requirement carries a huge risk >> of standing in the way of that. Also, I think that once we integrate the >> postgres >> FDW into core (that's the plan in the long run, right?), we're going to get >> a good approximation of that essentially for free. > > It's a poor approximation of it, free or not. > > If it actually worked well, I'd be happy. It doesn't. No proper > transaction support, no session pool, poor planning etc etc. At best > it's a band-aid, not something sufficiently good to recommend for > general production use. These all sound fixable, though. > FDWs are very good but aren't good enough for intra-database usage. OTOH, cross-database queries would by design be limited to databases within one cluster, whereas an FDW-based solution would not. I don't really see the difference between telling people "put all your tables into one database if you need to access them from within one session" and "put all your databases on one server if you need to access them from within one session". Plus, the more tightly different databases in the same cluster are coupled, the more people will set up one cluster per database for performance reasons. Then, when they discover they need inter-database queries after all, we'll again have to tell them "well, then don't set things up the way you have". If we want to make it easier for people to migrate from multiple databases to a single database with multiple schemas, maybe we should look into allowing nested schemas? AFAIK, the main reason not to do that is the ambiguity in the meaning of identifiers, which cross-database queries would have to deal with also. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 13:47, Simon Riggs wrote: On 22 May 2012 12:35, Florian Pflug wrote: * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. It's a poor approximation of it, free or not. If it actually worked well, I'd be happy. It doesn't. No proper transaction support, no session pool, poor planning etc etc. At best it's a band-aid, not something sufficiently good to recommend for general production use. - Transaction support: should be fixed, I guess. - Session pool: is this really needed? I would handle it externally -- for example, by specifying a connection string to a pgPool/pgBouncer as opposed to directly to the db server. - Planning: add a tunable specifying a higher cost (with an exception for cluster-local dblinks, which would have a lower cost), and the rest should be straightforward. Of course, planning wouldn't be as accurate (we can't access the other db's statistics in order to potentially rewrite conditions), but I don't think that would be a proper approach (separation of concerns, again). FDWs are very good but aren't good enough for intra-database usage. The optimization I just proposed (plus the required fixes to FDW) might very well solve this, while providing useful enhancements for all users, whether they are accessing cluster-local databases or not (or even truly foreign datasources). Regards, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 13:24, Simon Riggs wrote: On 22 May 2012 12:05, José Luis Tallón wrote: IMVHO: s/database/schema/g does resolve many of the problems that you were referring to... and 'dblink' should solve the rest, right? Please, feel free to point out what I am (most probably) not considering -- not experienced enough yet :) The choice of schema/database is an important one. If you get it wrong, you are in major difficulty. In many cases schemas would be a better choice, but not in all cases. So I'm interested in solving the problems for people who have multiple databases on the same server. Ok. Understood. Thank you for the clarification. dblink is the only solution, but it's a very poor way to do this when we have 2 databases on the same server. My thinking is that reaching out to multiple databases is actually mostly easy, except in a few places where dbid is hardwired into the backend. The only drawback I see is that it might weaken the separation. Even though arguably a kludge, dblink could have a "shortcut" added, whereby connections to another database within the same cluster would be serviced directly within the backend, as opposed to opening a new db connection. This is effectively a fastpath within dblink, which optimizes a relatively common case while at the same time not losing generality. On the other hand, the separation of databases allows what otherwise would only be possible by using multiple instances of the database server (à la Oracle, AFAIK) -- save for resource management, but that is another question altogether. Separation of databases is fine. I have no intention to change that, as long as the user wishes that. Perfect. Thanks, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Problem with error response message
Hi, The ErrorResponse message from the backend includes field type codes such as: S Severity: the field contents are ERROR, FATAL, or PANIC (in an error message), or WARNING, NOTICE, DEBUG, INFO, or LOG (in a notice message), or a localized translation of one of these. Always present. C Code: the SQLSTATE code for the error (see Appendix A). Not localizable. Always present. If the severity is ERROR, the current transaction is aborted but the session continues. If it is FATAL or worse, the backend disconnects the current session, or the postmaster even restarts all backends (which results in session disconnection anyway). Cluster management tools such as pgpool-II have to handle the message carefully because there's a big difference between just aborting a transaction and disconnecting a session. In the former case, pgpool just forwards the error message to the frontend, but in the latter, it should disconnect the session. For this purpose, pgpool-II examines the SQLSTATE code. Unfortunately, some error codes such as 40001 (serialization_failure) come with different severities, ERROR or FATAL. Here are some examples: ./tcop/postgres.c- ereport(FATAL, ./tcop/postgres.c: (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), ./tcop/postgres.c-errmsg("terminating connection due to conflict with recovery"), ./tcop/postgres.c- errdetail_recovery_conflict())); ./commands/trigger.c- ereport(ERROR, ./commands/trigger.c: (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), ./commands/trigger.c- errmsg("could not serialize access due to concurrent update"))); ./commands/trigger.c- if (!ItemPointerEquals(&update_ctid, &tuple.t_self)) So I tried to look at the severity code to find out whether it is a fatal error or not. Unfortunately it is almost impossible because it could be localized. My question is, why do we need to localize this? Shouldn't it be a computer-recognizable message? Also, I wonder why "conflict with recovery" generates the same error code as a serialization error. To me that does not seem like a very consistent choice. IMO, we should at least assign different error codes to conditions that have totally different effects on the frontend: aborting the transaction vs. terminating the session. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On Tue, May 22, 2012 at 7:35 AM, Florian Pflug wrote: >> * Allow users to access tables in >1 database easily, with appropriate >> rights. > > That one I'm very sceptical about. In the long run, I think we want better > separation of databases, not less, and this requirement carries a huge risk > of standing in the way of that. Also, I think that once we integrate the > postgres > FDW into core (that's the plan in the long run, right?), we're going to get > a good approximation of that essentially for free. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 12:35, Florian Pflug wrote: >> * Allow users to access tables in >1 database easily, with appropriate >> rights. > > That one I'm very sceptical about. In the long run, I think we want better > separation of databases, not less, and this requirement carries a huge risk > of standing in the way of that. Also, I think that once we integrate the > postgres > FDW into core (that's the plan in the long run, right?), we're going to get > a good approximation of that essentially for free. It's a poor approximation of it, free or not. If it actually worked well, I'd be happy. It doesn't. No proper transaction support, no session pool, poor planning, etc. At best it's a band-aid, not something sufficiently good to recommend for general production use. FDWs are very good but aren't good enough for intra-database usage. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On May22, 2012, at 11:46 , Simon Riggs wrote: > * Ability to have a Role that can only access one Database > > * Allow user info to be dumped with a database, to make a db > completely self-consistent These two could be achieved by having database-local roles I think. > * Allow databases to be transportable That would be very useful, but extremely hard to do unless we switch to per-database XID spaces. Or unless we're content with having databases only be transportable after some special "MAKE TRANSPORTABLE" operation, which would freeze all tuples and make the database read-only. > * Allow users to access tables in >1 database easily, with appropriate rights. That one I'm very sceptical about. In the long run, I think we want better separation of databases, not less, and this requirement carries a huge risk of standing in the way of that. Also, I think that once we integrate the postgres FDW into core (that's the plan in the long run, right?), we're going to get a good approximation of that essentially for free. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On 22 May 2012 12:12, PostgreSQL - Hans-Jürgen Schönig wrote: > > On May 22, 2012, at 9:57 AM, Simon Riggs wrote: > >> On 22 May 2012 06:50, Tom Lane wrote: >> >>> Currently, the planner keeps paths that appear to win on the grounds of >>> either cheapest startup cost or cheapest total cost. It suddenly struck >>> me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor >>> fast-start preference, etc) we could know a-priori that cheapest startup >>> cost is not going to be interesting, and hence immediately discard any >>> path that doesn't win on total cost. >> >> My experience is that most people don't provide a LIMIT explicitly >> even when they know that's the desired behaviour. That's because >> either they simply don't understand that SQL can return lots of rows, >> or SQL knowledge isn't enough, or worse that people don't even know >> that specifying it would alter query plans. >> >> Regrettably the current planning of LIMIT clauses causes more problems >> so in many cases these have been explicitly removed from SQL by >> developers that know how many rows they wish to see. >> >> I would have proposed a default-LIMIT parameter before now, but for >> that last point. > > > this sounds like a total disaster to me ... > why in the world should we have a default LIMIT parameter? It's common for people to write SQL they want to have a fast response to, yet not to specify that in their SQL code. > i guess if somebody is not able to use LIMIT he should better not touch the > DB. > we clearly cannot fix incompetence by adding parameters. Not sure I understand that argument. Many things are set by default in parameters. Oracle allows the user to choose whether plans return rows quickly, or execute plans efficiently overall. That is a relevant user choice, not a hint. We simply don't know, if LIMIT is absent, whether the user wishes fast or efficient plans. If they use LIMIT, we know they want a fast plan. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
> Thanks for your comments. They clarify a lot. > But I still don't realize how we can distinguish IS_LCPRV2 and IS_LC2? > Isn't it possible for them to produce the same pg_wchar? If LB is in the 0x90 - 0x99 range, then it is LC2. If LB is in the 0xf0 - 0xff range, then it is LCPRV2. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22 May 2012 12:05, José Luis Tallón wrote: > IMVHO: s/database/schema/g does resolve many of the problems that you were > referring to... and 'dblink' should solve the rest, right? > Please, feel free to point out what I am (most probably) not considering -- > not experienced enough yet :) The choice of schema/database is an important one. If you get it wrong, you are in major difficulty. In many cases schemas would be a better choice, but not in all cases. So I'm interested in solving the problems for people who have multiple databases on the same server. dblink is the only solution, but it's a very poor way to do this when we have 2 databases on the same server. My thinking is that reaching out to multiple databases is actually mostly easy, except in a few places where dbid is hardwired into the backend. > On the other hand, the separation of databases allows what otherwise would > only be possible by using multiple instances of the database server (à la > Oracle, AFAIK ) -- save for resource management, but that is another > question whatsoever. Separation of databases is fine. I have no intention to change that, as long as the user wishes that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On May 22, 2012, at 9:57 AM, Simon Riggs wrote: > On 22 May 2012 06:50, Tom Lane wrote: > >> Currently, the planner keeps paths that appear to win on the grounds of >> either cheapest startup cost or cheapest total cost. It suddenly struck >> me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor >> fast-start preference, etc) we could know a-priori that cheapest startup >> cost is not going to be interesting, and hence immediately discard any >> path that doesn't win on total cost. > > My experience is that most people don't provide a LIMIT explicitly > even when they know that's the desired behaviour. That's because > either they simply don't understand that SQL can return lots of rows, > or SQL knowledge isn't enough, or worse that people don't even know > that specifying it would alter query plans. > > Regrettably the current planning of LIMIT clauses causes more problems > so in many cases these have been explicitly removed from SQL by > developers that know how many rows they wish to see. > > I would have proposed a default-LIMIT parameter before now, but for > that last point. This sounds like a total disaster to me ... why in the world should we have a default LIMIT parameter? I guess if somebody is not able to use LIMIT, he had better not touch the DB. We clearly cannot fix incompetence by adding parameters. regards, hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changing the concept of a DATABASE
On 22/05/12 11:46, Simon Riggs wrote: On 21 May 2012 20:40, Stephen Frost wrote: This is important. I like the idea of breaking down the barriers between databases to allow it to be an option for one backend to access tables in multiple databases. The current mechanism doesn't actually prevent looking at data from other databases using internal APIs, so full security doesn't exist. It's a very common user requirement to wish to join tables stored in different databases, which ought to be possible more cleanly with correct privileges. That's really a whole different ball of wax and I don't believe what Robert was proposing would actually allow that to happen due to the other database-level things which are needed to keep everything consistent... That's my understanding, anyway. I'd be happy as anyone if we could actually make it work, but isn't like the SysCache stuff per database? Also, cross-database queries would actually make it more difficult to have per-database roles, which is one thing that I was hoping we might be able to work into this, though perhaps we could have a shared roles table and a per-database roles table and only 'global' roles would be able to issue cross-database queries.. IMVHO: s/database/schema/g does resolve many of the problems that you were referring to... and 'dblink' should solve the rest, right? Please, feel free to point out what I am (most probably) not considering -- not experienced enough yet :) On the other hand, the separation of databases allows what otherwise would only be possible by using multiple instances of the database server (à la Oracle, AFAIK) -- save for resource management, but that is another question altogether. So collecting a few requirements from various places: * Ability to have a Role that can only access one Database Yes, please * Allow user info to be dumped with a database, to make a db completely self-consistent +1 * Allow databases to be transportable +1. Ideally, the binary format could be made platform-independent, so that a snapshot/rsync of the cluster can span architectures easily. AFAIK, endianness-change is relatively cheap on current processors [1 ASM instruction?] and it's not like we are memory-mapping tuples anyway (TOASTed values can certainly not be mapped), so it shouldn't be noticeable performance-wise. * Allow users to access tables in >1 database easily, with appropriate rights. See above, but I am probably wrong ... I don't see any reasons why these things would be against each other. They look quite orthogonal to me. The main objectives are to make a Database a more easily used administrative grouping. At present, people who use multiple Databases face many problems - they aren't as separate as you'd like, but neither can they be ignored when required. The idea of "one main database per session" is fine, but wiring it so closely into the backend has a few disadvantages, many of them weird internal things. Are there arguments against those requirements before we spend time on design/thinking? OTOH, the postmaster/cluster - session/database coupling looks clean and simple to me... and seems to make the code simpler. This can only be good (but again, I don't know enough yet to be sure) Regards, Jose Luis Tallon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte
On Tue, May 22, 2012 at 11:50 AM, Tatsuo Ishii wrote: > > I think it's possible. The first characters are defined like this: > > #define IS_LCPRV1(c)((unsigned char)(c) == 0x9a || (unsigned char)(c) == 0x9b) > #define IS_LCPRV2(c)((unsigned char)(c) == 0x9c || (unsigned char)(c) == 0x9d) > > It seems IS_LCPRV1 is not used in any of PostgreSQL supported > encodings at this point, that means there's 0 chance which existing > databases include LCPRV1. So you could safely ignore it. > > For IS_LCPRV2, it is only used for Chinese encodings (EUC_TW and BIG5) > in backend/utils/mb/conversion_procs/euc_tw_and_big5/euc_tw_and_big5.c > and it is fixed to 0x9d. So you can always restore the value to 0x9d. > > > Also in this part of code we're shifting first byte by 16 bits: > > > > if (IS_LC1(*from) && len >= 2) > > { > > *to = *from++ << 16; > > *to |= *from++; > > len -= 2; > > } > > else if (IS_LCPRV1(*from) && len >= 3) > > { > > from++; > > *to = *from++ << 16; > > *to |= *from++; > > len -= 3; > > } > > > > Why don't we shift it by 8 bits? > > Because we want the first byte of LC1 case to be placed in the second > byte of wchar. i.e. > > 0th byte: always 0 > 1st byte: leading byte (the first byte of the multibyte) > 2nd byte: always 0 > 3rd byte: the second byte of the multibyte > > Note that we always assume that the 1st byte (called "leading byte": > LB in short) represents the id of the character set (from 0x81 to > 0xff) in MULE INTERNAL encoding. For the mapping between LB and > charsets, see pg_wchar.h. Thanks for your comments. They clarify a lot. But I still don't realize how we can distinguish IS_LCPRV2 and IS_LC2? Isn't it possible for them to produce the same pg_wchar? -- With best regards, Alexander Korotkov.
[HACKERS] Changing the concept of a DATABASE
On 21 May 2012 20:40, Stephen Frost wrote: >> This is important. I like the idea of breaking down the barriers >> between databases to allow it to be an option for one backend to >> access tables in multiple databases. The current mechanism doesn't >> actually prevent looking at data from other databases using internal >> APIs, so full security doesn't exist. It's a very common user >> requirement to wish to join tables stored in different databases, >> which ought to be possible more cleanly with correct privileges. > > That's really a whole different ball of wax and I don't believe what > Robert was proposing would actually allow that to happen due to the > other database-level things which are needed to keep everything > consistent... That's my understanding, anyway. I'd be happy as anyone > if we could actually make it work, but isn't like the SysCache stuff per > database? Also, cross-database queries would actually make it more > difficult to have per-database roles, which is one thing that I was > hoping we might be able to work into this, though perhaps we could have > a shared roles table and a per-database roles table and only 'global' > roles would be able to issue cross-database queries.. So collecting a few requirements from various places: * Ability to have a Role that can only access one Database * Allow user info to be dumped with a database, to make a db completely self-consistent * Allow databases to be transportable * Allow users to access tables in >1 database easily, with appropriate rights. I don't see any reasons why these things would be against each other. The main objectives are to make a Database a more easily used administrative grouping. At present, people who use multiple Databases face many problems - they aren't as separate as you'd like, but neither can they be ignored when required. The idea of "one main database per session" is fine, but wiring it so closely into the backend has a few disadvantages, many of them weird internal things. Are there arguments against those requirements before we spend time on design/thinking? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] heap metapages
On 22 May 2012 02:50, Robert Haas wrote: >> Not very sure why a metapage is better than a catalog table. > > Mostly because there's no chance of the startup process accessing a > catalog table during recovery, but it can read a metapage. OK, sounds reasonable. Based upon all you've said, I'd suggest that we make a new kind of fork, in a separate file for this, .meta. But we also optimise the VM and FSM in the way you suggest so that we can replace .vm and .fsm with just .meta in most cases. Big tables would get a .vm and .fsm appearing when they get big enough, but that won't challenge the inode limits. When .vm and .fsm do appear, we remove that info from the metapage - that means we keep all code as it is currently, except for an optimisation of .vm and .fsm when those are small enough to do so. We can watermark data files using special space on block zero, sneaking that in when the page is next written, but that is regarded as optional, rather than an essential aspect of an upgrade/normal operation. Having pg_upgrade touch data files is both dangerous and difficult to back out in case of a mistake, so I am wary of putting the metapage at block 0. Doing it the way I suggest means the .meta files would be wholly new and can be deleted as a back-out. We can also clean away any unnecessary .vm/.fsm files as a later step. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier
On 22 May 2012 06:50, Tom Lane wrote: > Currently, the planner keeps paths that appear to win on the grounds of > either cheapest startup cost or cheapest total cost. It suddenly struck > me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor > fast-start preference, etc) we could know a-priori that cheapest startup > cost is not going to be interesting, and hence immediately discard any > path that doesn't win on total cost. My experience is that most people don't provide a LIMIT explicitly even when they know that's the desired behaviour. That's because either they simply don't understand that SQL can return lots of rows, or SQL knowledge isn't enough, or worse that people don't even know that specifying it would alter query plans. Regrettably the current planning of LIMIT clauses causes more problems so in many cases these have been explicitly removed from SQL by developers that know how many rows they wish to see. I would have proposed a default-LIMIT parameter before now, but for that last point. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
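For illustration (the table and its index are assumed, and actual plans depend on data and cost settings): without a LIMIT only total cost matters, so sorting the whole table may win; with a LIMIT, a fast-start path such as an index scan in created_at order can win even though its total cost is higher.

EXPLAIN SELECT * FROM events ORDER BY created_at;
EXPLAIN SELECT * FROM events ORDER BY created_at LIMIT 10;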