Re: [HACKERS] [PATCH] Support for foreign keys with arrays
On 6 April 2012 07:21, Peter Eisentraut wrote:
> On lör, 2012-03-24 at 10:01 +, Gianni Ciolli wrote:
>> ON (DELETE | UPDATE) actions for EACH foreign keys
>> ==================================================
>>
>>                    |    ON     |    ON     |
>>  Action            |  DELETE   |  UPDATE   |
>> -------------------+-----------+-----------+
>>  CASCADE           | Row       | Forbidden |
>>  SET NULL          | Row       | Row       |
>>  SET DEFAULT       | Row       | Row       |
>>  EACH CASCADE      | Element   | Element   |
>>  EACH SET NULL     | Element   | Element   |
>>  EACH SET DEFAULT  | Forbidden | Forbidden |
>>  NO ACTION         | -         | -         |
>>  RESTRICT          | -         | -         |
>> -------------------+-----------+-----------+
>>
> I took another fresh look at this feature after not having looked for a
> month or two. I think the functionality is probably OK, but I find the
> interfaces somewhat poorly named. Consider, "PostgreSQL adds EACH
> foreign keys" -- huh? I think the key word ELEMENT would be more
> descriptive and precise, and it also leaves the door open to other kinds
> of non-atomic foreign key relationships outside of arrays. EACH has no
> relationship with arrays. It might as well refer to each row.
>
> On the matter of the above chart, there has been a long back and forth
> about whether the row or the element case should be the default. Both
> cases are probably useful, but unfortunately you have now settled on
> making maximum destruction the default. Additionally, we would now have
> the case that sometimes, depending on some configuration elsewhere, an
> ON DELETE CASCADE deletes more than what was actually involved in the
> foreign key. What I'd suggest is to make both cases explicit. That is,
> forbid ON DELETE CASCADE altogether and make people write ON DELETE
> CASCADE ROW or ON DELETE CASCADE ELEMENT. In addition to making things
> more explicit and safer, it would again leave the door open to other
> kinds of relationships later.
+1

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Fri, Jun 15, 2012 at 10:19 AM, Bruce Momjian wrote:
> On Thu, Jun 14, 2012 at 02:43:04PM -0400, Tom Lane wrote:
>> Euler Taveira writes:
>> > On 14-06-2012 02:19, Tom Lane wrote:
>> >> ... I especially think that importing bzip2
>> >> is a pretty bad idea --- it's not only a new dependency, but bzip2's
>> >> compression versus speed tradeoff is entirely inappropriate for this
>> >> use-case.
>> >
>> > I see, the idea is that bzip2 would be a compiled-in option (not
>> > enabled by default) just to give another compression option.
>>
>> I'm not particularly thrilled with "let's have more compression options
>> just to have options". Each such option you add is another source of
>> fail-to-connect incompatibilities (when either the client or the server
>> doesn't have it). Moreover, while there are a lot of compression
>> algorithms out there, a lot of them are completely unsuited for this
>> use-case. If memory serves, bzip2 for example requires fairly large
>> data blocks in order to get decent compression, which means you are
>> either going to get terrible compression or suffer very bad latency
>> when trying to apply it to a connection data stream.

Agreed. I think there are probably arguments to be had for supporting compression without openssl (see below), but I don't think we need to have a whole set of potentially incompatible ways of doing it. Picking one that's good for the common case and not completely crap for the corner cases would be a better choice (meaning bzip2 is probably a very bad choice).

>> So I've got very little patience with the idea of "let's put in some
>> hooks and then great things will happen". It would be far better all
>> around if we supported exactly one, well-chosen, method. But really
>> I still don't see a reason not to let openssl do it for us.
>
> Do we just need to document SSL's NULL encryption option?

Does the SSL NULL encryption+compression thing work if you're not using openssl?
For one thing, some of us still hold a hope to support non-openssl libraries in both libpq and the server side, so it's something that would need to be supported by the standard, and thus available in most libraries, not to invalidate that.

Second, we also have things like the JDBC driver and the .Net driver that don't use libpq. The JDBC driver uses the native Java SSL support, AFAIK. Does that one support the compression, and does it support controlling it?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Saving snapshots for later use
On 15.06.2012 06:19, Nikolas Everett wrote:
> I've been a PostgreSQL user for years and have a feature that I'd like
> to see implemented. I've started playing with the source and have a
> vague implementation plan for it, but before I go blundering around I'd
> like to run the idea by this list. So here it is: I'd like to be able to
> save the current snapshot and then at a later date roll the entire
> database back to that snapshot, essentially erasing everything that
> happened since the snapshot. Well, more like making all that stuff
> invisible. This would be really useful when testing applications whose
> databases take a while to set up and whose tests change the database in
> some way. It could also provide a way to recover from some horrible user
> error. Frankly, I'm more interested in this for testing, but having the
> ability to recover from errors would be nice.

Have you considered using filesystem snapshots?

> So I'm wondering a few things:
>
> 1. I _think_ all I have to do to store one of these snapshots is to
> store the current xid and a list of all in-progress transactions. I
> believe I should be able to reconstitute this with the commit log to
> determine visibility at that snapshot.
> 2. I _think_ I can save that information in a system table. It feels
> weird to use a table to store visibility information but part of me
> thinks that is ok. I can't put my finger on why I think that is ok
> though.
> 3. I'm not really sure at all what to do with other connections that are
> doing things during a snapshot restore. They might just have to get cut.
> For my purposes this would be ok but I can imagine that would be a
> problem.
> 4. I think not allowing the user to save a snapshot during a transaction
> would simplify the bookkeeping a ton. I know what I typed in (1)
> wouldn't cover snapshots inside transactions. I just don't think that is
> worth the effort.
> 5. On a more personal note I have no idea if I am capable of
> implementing this.
> I'm really not very good with C but I haven't had too much trouble
> figuring out how to add new parser rules or map them into the tcop
> layer. I figured out how to create a new system table without too much
> trouble but haven't a clue how upgrades work for the new table. I'm not
> sure how to read from the system table but I see there is an FAQ entry
> for that.
> 6. I think it'd be ok if restoring an older snapshot removes a newer
> snapshot. Using the snapshot mechanism to jump forward in time just
> sounds hard to keep straight.

To revert the database to the earlier state, you'll also need to somehow roll back all the already-committed transactions. At first sight, that seems easy - just modify clog to mark them as aborted. However, it's not that easy, because you'd also need to somehow clear hint bits that claim those transactions to be committed. Or prevent those hint bits from being set in the first place, but that affects performance, so I don't think that would be very useful for testing.

All in all, I think a filesystem snapshot is simpler. Or even just use a copy of the database, and use rsync to revert it back to the original state. You will have to restart postgres, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
[HACKERS] Saving snapshots for later use
I've been a PostgreSQL user for years and have a feature that I'd like to see implemented. I've started playing with the source and have a vague implementation plan for it, but before I go blundering around I'd like to run the idea by this list. So here it is: I'd like to be able to save the current snapshot and then at a later date roll the entire database back to that snapshot, essentially erasing everything that happened since the snapshot. Well, more like making all that stuff invisible. This would be really useful when testing applications whose databases take a while to set up and whose tests change the database in some way. It could also provide a way to recover from some horrible user error. Frankly, I'm more interested in this for testing, but having the ability to recover from errors would be nice.

Once you've saved the snapshot it'd be tempting to expose the ability to run queries against the old snapshot as well. This seems like more of a bonus in my mind, though.

From here on out I'm not sure if I have my terms right so please correct me if I'm wrong. So I'm wondering a few things:

1. I _think_ all I have to do to store one of these snapshots is to store the current xid and a list of all in-progress transactions. I believe I should be able to reconstitute this with the commit log to determine visibility at that snapshot.
2. I _think_ I can save that information in a system table. It feels weird to use a table to store visibility information but part of me thinks that is ok. I can't put my finger on why I think that is ok though.
3. I'm not really sure at all what to do with other connections that are doing things during a snapshot restore. They might just have to get cut. For my purposes this would be ok but I can imagine that would be a problem.
4. I think not allowing the user to save a snapshot during a transaction would simplify the bookkeeping a ton. I know what I typed in (1) wouldn't cover snapshots inside transactions.
I just don't think that is worth the effort.
5. On a more personal note I have no idea if I am capable of implementing this. I'm really not very good with C but I haven't had too much trouble figuring out how to add new parser rules or map them into the tcop layer. I figured out how to create a new system table without too much trouble but haven't a clue how upgrades work for the new table. I'm not sure how to read from the system table but I see there is an FAQ entry for that.
6. I think it'd be ok if restoring an older snapshot removes a newer snapshot. Using the snapshot mechanism to jump forward in time just sounds hard to keep straight.

I'd attach a patch of my work so far but it certainly isn't anything mind blowing at this point. Sorry to ramble on for a solid page. Thanks to everyone who made it to the end.

Cheers,
Nik
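[Editor's note: a minimal sketch of the visibility rule item (1) above relies on - a snapshot as (next xid, set of in-progress xids) plus a commit-log lookup. All names are hypothetical; this is a toy model, not PostgreSQL's actual MVCC code.]

```python
# Toy model of "save the current xid and a list of in-progress
# transactions, then reconstitute visibility with the commit log".

def save_snapshot(next_xid, in_progress_xids):
    """Capture the state item (1) proposes to store."""
    return {"xmax": next_xid, "in_progress": frozenset(in_progress_xids)}

def xid_visible(snapshot, xid, committed):
    """Would work done by `xid` be visible under the saved snapshot?

    committed: callable mapping xid -> bool, standing in for the clog.
    """
    if xid >= snapshot["xmax"]:
        return False              # transaction began after the snapshot
    if xid in snapshot["in_progress"]:
        return False              # was still running at snapshot time
    return committed(xid)         # otherwise defer to the commit log

snap = save_snapshot(next_xid=100, in_progress_xids={97, 99})
clog = {95: True, 96: True, 97: True, 98: False}.get
assert xid_visible(snap, 95, clog)       # committed before snapshot
assert not xid_visible(snap, 97, clog)   # in progress at snapshot time
assert not xid_visible(snap, 101, clog)  # began after snapshot
```

Note how 97 stays invisible even though the clog later says it committed: the in-progress list frozen at snapshot time takes precedence, which is exactly why it has to be saved.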
Re: [HACKERS] sortsupport for text
On Thu, Jun 14, 2012 at 6:30 PM, Peter Geoghegan wrote:
>> Here we know that it doesn't matter, so the application of Knuth's
>> first law of optimization is appropriate.
>
> I'm not advocating some Byzantine optimisation, or even something that
> could reasonably be described as an optimisation at all here. I'm
> questioning why you've unnecessarily complicated the code by having
> the buffer size just big enough to fit the biggest value seen so far,
> but arbitrarily aligned to a value that is completely irrelevant to
> bttextfastcmp_locale(), rather than using simple geometric expansion,
> which is more or less the standard way of managing the growth of a
> dynamic array.

Well, so... palloc, and most malloc implementations, don't actually just double every time forever. They double up to some relatively small number like 1MB, and then they expand in 1MB chunks. std::vector may well behave as you're describing, but that's doing something completely different. We're not accumulating a string of unknown length into a buffer, with the risk of having to copy the whole buffer every time we reallocate. We know the exact size of the buffer we need for any given string, so the cost of a pfree/palloc is only the cost of releasing and allocating memory; there is no additional copying cost, as there would be for std::vector.

Look at it another way. Right now, the worst case number of temporary buffer allocations is about 2N lg N, assuming we do about N lg N comparisons during a sort. If we simply implemented the most naive buffer reallocation algorithm possible, we might in the worst case reallocate each buffer up to N times. That is already better than what we do now by a factor of lg N. If we suppose N is about a million, that's an improvement of 20x *in the worst case* - typically, the improvement will be much larger, because all the strings will be of similar length or we won't hit them in exactly increasing order of length.
The algorithm I've actually implemented bounds the worst case 1024 times more tightly - given N strings, we can't need to enlarge the buffer more than N/1024 times. So with N of around a million, this algorithm should eliminate *at least* 99.995% of the pallocs we currently do. How much better does it need to be?

> You have to grow the array in some way. The basic approach I've
> outlined has something to recommend it - why does it make sense to
> align the size of the buffer to TEXTBUFLEN in particular though?

There's nothing magic about TEXTBUFLEN as far as that goes. We could round to the nearest multiple of 8kB or whatever if that seemed better. But if we rounded to, say, the nearest 1MB, then someone sorting strings that are 2kB long would use 2MB of memory for these buffers. Considering that we ship with work_mem = 1MB, that could easily end up wasting almost twice work_mem. I don't see how you can view that as a trivial problem. It might be worth sucking that up if it seemed likely to dramatically improve performance, but it doesn't.

> It's quite easy to imagine what you've done here resulting in an
> excessive number of allocations (and pfree()s), which *could* be
> expensive.

Actually, I can't imagine that at all, per the above analysis. If I could, I might agree with you. :-)

> There is a trade-off between space and time to be made here, but I
> don't know why you think that the right choice is to use almost the
> smallest possible amount of memory in all cases.

Because I think that with the current implementation I can have my cake and eat it, too. I believe I've gotten essentially all the available speedup for essentially no memory wastage. If you can convince me that I've missed something and there is still a meaningful amount of palloc overhead left to be squeezed out, I'm all ears.
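[Editor's note: the bound argued above is easy to check with a small simulation - grow-to-exact-need versus growing to the needed size rounded up to a multiple of TEXTBUFLEN, over a worst-case (sorted) stream of string lengths. This is an illustrative sketch, not the patch's actual C code; names and the 4000-byte length cap are arbitrary.]

```python
# Count how many times each growth policy must enlarge its buffer over a
# worst-case (increasing) sequence of string lengths.
import random

TEXTBUFLEN = 1024  # the rounding granularity discussed above

def count_enlargements(lengths, grow):
    size = enlargements = 0
    for n in lengths:
        if n > size:          # buffer too small: pfree/palloc a bigger one
            size = grow(n)
            enlargements += 1
    return enlargements

random.seed(42)
lengths = sorted(random.randint(1, 4000) for _ in range(100_000))

exact = count_enlargements(lengths, lambda n: n)  # exact-fit policy
rounded = count_enlargements(
    lengths, lambda n: -(-n // TEXTBUFLEN) * TEXTBUFLEN)  # round up

assert rounded <= 4      # at most ceil(4000 / TEXTBUFLEN) enlargements
assert exact > 1000      # exact fit reallocates once per new max length
```

With rounding, the enlargement count is bounded by maxlen/TEXTBUFLEN rather than by the number of distinct lengths, which is the "1024 times more tightly" claim in miniature.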
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] libpq compression
On Thu, Jun 14, 2012 at 02:43:04PM -0400, Tom Lane wrote:
> Euler Taveira writes:
> > On 14-06-2012 02:19, Tom Lane wrote:
> >> ... I especially think that importing bzip2
> >> is a pretty bad idea --- it's not only a new dependency, but bzip2's
> >> compression versus speed tradeoff is entirely inappropriate for this
> >> use-case.
>
> > I see, the idea is that bzip2 would be a compiled-in option (not
> > enabled by default) just to give another compression option.
>
> I'm not particularly thrilled with "let's have more compression options
> just to have options". Each such option you add is another source of
> fail-to-connect incompatibilities (when either the client or the server
> doesn't have it). Moreover, while there are a lot of compression
> algorithms out there, a lot of them are completely unsuited for this
> use-case. If memory serves, bzip2 for example requires fairly large
> data blocks in order to get decent compression, which means you are
> either going to get terrible compression or suffer very bad latency
> when trying to apply it to a connection data stream.
>
> So I've got very little patience with the idea of "let's put in some
> hooks and then great things will happen". It would be far better all
> around if we supported exactly one, well-chosen, method. But really
> I still don't see a reason not to let openssl do it for us.

Do we just need to document SSL's NULL encryption option?

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
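[Editor's note: the latency point about stream-oriented codecs can be demonstrated with Python's stdlib: zlib can flush a complete, immediately decodable chunk at every message boundary, which is what a connection protocol needs. This is an illustrative sketch using stdlib zlib, not libpq code.]

```python
# A stream-friendly codec lets the sender flush at message boundaries so
# the receiver can decode each message without waiting for more input.
import zlib

comp = zlib.compressobj()
decomp = zlib.decompressobj()

def send(msg: bytes) -> bytes:
    """Compress one protocol message and flush it onto the wire."""
    return comp.compress(msg) + comp.flush(zlib.Z_SYNC_FLUSH)

wire = send(b"SELECT 1;")
# The receiver can decompress this message immediately:
assert decomp.decompress(wire) == b"SELECT 1;"
```

Block-oriented codecs like bzip2 offer no such mid-stream flush: they only emit output once a large block fills up (or the stream ends), which is the "terrible compression or very bad latency" trade-off Tom describes.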
Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata
> I guess my first question is: why do we need this? There are lots of
> things in the TODO list that someone wanted once upon a time, but
> they're not all actually important. Do you have reason to believe
> that this one is? It's been six years since that email, so it's worth
> asking if this is actually relevant.

As far as I know, pg_control is not WAL-protected, which means that if it gets corrupt for any reason (say, a disk crash during flush leaves it partially written), recovery of the database can fail. The user can then use pg_resetxlog to recover the database. Currently pg_resetxlog works on guessed values for pg_control. This implementation can improve that logic: instead of guessing, it can try to regenerate the values from the WAL. That can allow better recovery in certain circumstances.

> The deadline for patches for this CommitFest is today, so I think you
> should target any work you're starting now for the NEXT CommitFest.

Oh, I am sorry; as this was my first time, I was not fully aware of the deadline. However, I still seek your opinion on whether it makes sense to work on this feature.

-----Original Message-----
From: Robert Haas [mailto:robertmh...@gmail.com]
Sent: Friday, June 15, 2012 12:40 AM
To: Amit Kapila
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata

On Thu, Jun 14, 2012 at 11:39 AM, Amit Kapila wrote:
> I am planning to work on the below Todo list item for this CommitFest
> Allow WAL information to recover corrupted pg_controldata
> http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php

The deadline for patches for this CommitFest is today, so I think you should target any work you're starting now for the NEXT CommitFest.

> I wanted to confirm my understanding about the work involved for this
> patch. The existing patch has the following set of problems:
> 1. Memory leak and linked list code path is not proper
> 2.
lock check for if the server is already running, is removed in the patch
> and needs to be reverted
> 3. Refactoring of the code.
>
> Apart from the above, what I understood from the patch is that its
> intention is to generate values for ControlFile using WAL logs when the
> -r option is used.
>
> The change in algorithm from current will be: if the control file is
> corrupt, which essentially means ReadControlFile() will return False,
> then it should generate values (checkPointCopy, checkPoint,
> prevCheckPoint, state) using WAL if the -r option is enabled.
>
> Also for the -r option, it doesn't need to call the function
> FindEndOfXLOG(), as that work will be achieved by the above point.
>
> It will just rewrite the control file and not do the other resets.
>
> The algorithm for restoring the pg_control values from old xlog files:
> 1. Retrieve all of the active xlog files from the xlog directory into a
> list, in increasing order according to their timeline, log id, and
> segment id.
> 2. Search the list to find the oldest xlog file of the latest timeline.
> 3. Search the records from the oldest xlog file of the latest timeline
> to the latest xlog file of the latest timeline; if a checkpoint record
> is found, update the latest checkpoint and previous checkpoint.
>
> Apart from the above, some changes in code will be required after the
> xlog patch by Heikki.
>
> Please suggest if my understanding is correct.

I guess my first question is: why do we need this? There are lots of things in the TODO list that someone wanted once upon a time, but they're not all actually important. Do you have reason to believe that this one is? It's been six years since that email, so it's worth asking if this is actually relevant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
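[Editor's note: step 1 and 2 of the scan order described above - sort candidate segment files by (timeline, log, seg) and keep the latest timeline, oldest file first - can be sketched briefly. The 24-hex-digit file-name layout (8 digits each for timeline, log, and segment) follows WAL segment naming of that era; the record-level checkpoint scan itself is elided.]

```python
# Order WAL segment file names so the latest timeline can be scanned from
# its oldest segment to its newest, looking for checkpoint records.

def segments_to_scan(names):
    """Return the latest timeline's segments, oldest first."""
    parsed = []
    for n in names:
        tli = int(n[0:8], 16)    # timeline
        log = int(n[8:16], 16)   # log id
        seg = int(n[16:24], 16)  # segment id
        parsed.append((tli, log, seg, n))
    parsed.sort()
    latest_tli = parsed[-1][0]
    return [name for tli, _, _, name in parsed if tli == latest_tli]

segs = ["000000010000000000000003",
        "000000020000000000000004",
        "000000020000000000000003"]
assert segments_to_scan(segs) == ["000000020000000000000003",
                                  "000000020000000000000004"]
```

The timeline-1 segment is dropped entirely: per step 2, only the latest timeline is searched for the latest and previous checkpoint.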
Re: [HACKERS] WIP: relation metapages
On Thu, Jun 14, 2012 at 5:34 PM, Heikki Linnakangas wrote:
> On 14.06.2012 18:01, Robert Haas wrote:
>> What I'm really looking for at this stage of the game is feedback on
>> the design decisions I made. The intention here is that it should be
>> possible to read old-format heaps and indexes transparently, but that
>> when we create or rewrite a relation, we add a new-style metapage.
>
> That dodges the problem of pg_upgrade, but eventually, we will want to
> use the metapage for something important, and it can't be optional at
> that point anymore. So we will eventually need to deal with pg_upgrade
> somehow.

Well, the code as I've written it deals with pg_upgrade just fine: you can move your old relation files over and they still work. What's missing at present is an efficient way to convert them to the new format. If you don't mind the inefficiency, you can use VACUUM FULL or CLUSTER, and you're there, but obviously we'll want something better before we start relying on this for anything too critical.

For indexes, it should be pretty trivial to reduce the requirement from "rewrite while holding AccessExclusiveLock" to "brief AccessExclusiveLock". Everything except GiST already has a metapage, so you just need to rewrite that page in the new format, which is a SMOP. GiST requires moving the existing metapage out to a free page (or a new page) and writing a metapage pointing to it into block 0, which is again pretty simple.

Heaps are a bit harder. We could adopt your idea of storing a block number in the metablock; any index TIDs that point to block 0 will be interpreted as pointing to that block instead. Then we can basically just relocate block 0 to any free block, or a new one, as with GiST. Alternatively, we can read the tuples in block 0 and reinsert them; vacuum; and then repurpose block 0.

Taking it even further, we could try to do better than "brief AccessExclusiveLock".
That might be possible too, especially for indexes, but I'm not sure that it's necessary, and I haven't thought through the details.

Even if we just get it down to "brief AccessExclusiveLock", I think we might also be able to improve the experience by making autovacuum do conversions automatically. So, if you pg_upgrade, the first autovacuum worker that spins through the database will ConditionalLockAcquire an AccessExclusiveLock on each relation in turn and try to do the conversion. If it can't get the lock it'll keep retrying until it succeeds. Odds are good that for most people this would make the addition of the metapage completely transparent.

On the other hand, if metapages exist mostly to support operations like making an unlogged table logged or vice versa, that's not really necessary: we can add the metapage when someone performs a DDL operation that requires it. There is a lot of optimization possible on top of the basic mechanism, but how much of it makes sense to do, and which parts, depends on exactly which of the many things we could do with this we end up deciding to actually do. I'm trying to start with the basics, which means getting the basic infrastructure in place and working.

> It would be nice to have the oid of the access method in the metapage
> (or some other way to identify it).

Yeah, I was thinking about that. I think a magic number might be preferable to an OID, and we actually already have that as the first 4 bytes of the access method metadata for all index types except GIN. I'm thinking about trying to fix up GIN so that it adds one as well; the trick is to do it in a way that is backward-compatible, which I have an idea how to do but haven't tackled yet. We can add a magic number for heaps as well.
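[Editor's note: identifying an access method from the magic number at the start of its metadata, as discussed above, amounts to a table lookup. The sketch below uses stand-in magic values and a fixed little-endian offset-0 layout; real metapages carry the access-method data after the page header and use native byte order.]

```python
# Dispatch on a 4-byte magic number read from a (simplified) metapage.
import struct

MAGIC_TO_AM = {0x053162: "btree", 0x6440640: "hash"}  # stand-in values

def identify_am(metapage: bytes) -> str:
    """Read the leading 4-byte magic and map it to an access method."""
    (magic,) = struct.unpack_from("<I", metapage, 0)
    return MAGIC_TO_AM.get(magic, "unknown")

page = struct.pack("<I", 0x053162) + b"\x00" * 28
assert identify_am(page) == "btree"
assert identify_am(b"\x00" * 32) == "unknown"
```

An OID would require a catalog lookup to interpret; a magic constant is self-describing even in a raw page dump, which is presumably part of its appeal here.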
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] transforms
Here is my first patch for the transforms feature. This is a mechanism to adapt data types to procedural languages. The previous proposal was here: http://archives.postgresql.org/pgsql-hackers/2012-05/msg00728.php

At the surface, this contains:

- New catalog pg_transform
- CREATE/DROP TRANSFORM

As proofs of concept and hopefully useful applications-to-be, I have included transforms for

- PL/Perl - hstore
- PL/Python - hstore
- PL/Python - ltree

These, along with the documentation for the above-mentioned DDL commands and catalog, can serve as explanations for reviewers.

There are numerous remaining discussion points and possible work items:

*) Currently, each of the above provided transforms is put into a separate subdirectory under contrib. This might be OK, but it is mainly because we cannot build more than one MODULE_big per directory. Fixing that might be more work than it's worth, but if we want to provide more of these in contrib, it might get crowded.

*) Since we have separate extensions for plperl and plperlu, and also for plpython2u and plpython3u, we need one extension for adapting each of these to a given type. You can see under contrib/hstore_plperl what this looks like. Maybe this isn't fixable or worth fixing. (With the directory and packaging not finalized, I haven't included any documentation for these contrib modules yet.)

*) No psql backslash commands yet. Could be added.

*) Permissions: Transforms don't have owners, a bit like casts. Currently, you are allowed to drop a transform if you own both the type and the language. That might be too strict; maybe owning the type and having privileges on the language would be enough.

*) If we want to offer the ability to write transforms to end users, then we need to install all the header files for the languages and the types. This actually worked quite well; including hstore.h and plperl.h, for example, gave you what you needed. In other cases, some headers might need cleaning up.
Also, some magic from the plperl and plpython build systems might need to be exposed, for example to find the header files. See existing modules for how this currently looks.

*) There is currently some syntax schizophrenia. The grammar accepts

CREATE TRANSFORM FOR hstore LANGUAGE plperl (
    FROM SQL WITH hstore_to_plperl,
    TO SQL WITH plperl_to_hstore
);

but pg_dump produces

CREATE TRANSFORM FOR hstore LANGUAGE plperl (
    FROM SQL WITH hstore_to_plperl(hstore),
    TO SQL WITH plperl_to_hstore(internal)
);

The SQL standard allows both. (In the same way that it allows 'DROP FUNCTION foo' without arguments, if it is not ambiguous.) Precedent is that CREATE CAST requires arguments, but CREATE LANGUAGE does not.

*) The issue of how to handle arguments of type "internal" was already discussed at http://archives.postgresql.org/pgsql-hackers/2012-05/msg00857.php and following. I have adopted the suggestion there to prohibit calling functions with arguments of type "internal", but that is probably full of holes and needs to be reviewed carefully.

*) I'm not very familiar with the Perl C API, so the hstore_plperl implementation is probably full of memory leaks and weirdnesses. It needs some help from a PL/Perl expert. (The PL/Python stuff is hopefully better.)

*) ltree_plpython lacks the transform function from Python to ltree, because of laziness and lack of clear documentation on how to construct ltree values.

*) I just noticed that the grammar changes broke ECPG. I'll look into that.

*) The regression tests for the core CREATE/DROP TRANSFORM functionality are in contrib/hstore_plpython, because the core has no suitable language available. It's a bit ugly, but I don't know of a better way short of implementing a fake language for regression testing only.

Attachment: transforms-20120615.patch.gz (GNU Zip compressed data)
Re: [HACKERS] sortsupport for text
On 14 June 2012 20:32, Robert Haas wrote:
> Yeah, but *it doesn't matter*. If you test this on strings that are
> long enough that they get pushed out to TOAST, you'll find that it
> doesn't measurably improve performance, because the overhead of
> detoasting so completely dominates any savings on the palloc side that
> you can't pick them out of the inter-run noise.

That's probably true, but it's also beside the point. As recently as a few hours ago, you yourself said "my guess is that most values people sort by are pretty short, making this concern mostly academic". Why are you getting hung up on toasting now?

> Here we know that it doesn't matter, so the application of Knuth's
> first law of optimization is appropriate.

I'm not advocating some Byzantine optimisation, or even something that could reasonably be described as an optimisation at all here. I'm questioning why you've unnecessarily complicated the code by having the buffer size just big enough to fit the biggest value seen so far, but arbitrarily aligned to a value that is completely irrelevant to bttextfastcmp_locale(), rather than using simple geometric expansion, which is more or less the standard way of managing the growth of a dynamic array.

You have to grow the array in some way. The basic approach I've outlined has something to recommend it - why does it make sense to align the size of the buffer to TEXTBUFLEN in particular though? It's quite easy to imagine what you've done here resulting in an excessive number of allocations (and pfree()s), which *could* be expensive. If you're so conservative about allocating memory, don't grow the array at quite so aggressive a rate as doubling it each time. There is a trade-off between space and time to be made here, but I don't know why you think that the right choice is to use almost the smallest possible amount of memory in all cases.

>> Another concern is that it seems fairly pointless to have two buffers.
>> Wouldn't it be more sensible to have a single buffer that was
>> partitioned to make two logical, equally-sized buffers, given that in
>> general each buffer is expected to grow at exactly the same rate?
>
> Sure, but it would be making the code more complicated in return for
> no measurable performance benefit. We generally avoid that.

Fair enough.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] WAL format changes
On Thursday, June 14, 2012 11:01:42 PM Heikki Linnakangas wrote:
> As I threatened earlier
> (http://archives.postgresql.org/message-id/4fd0b1ab.3090...@enterprisedb.com),
> here are three patches that change the WAL format. The goal is to
> change the format so that when you're inserting a WAL record of a given
> size, you know exactly how much space it requires in the WAL.
>
> 1. Use a 64-bit segment number, instead of the log/seg combination. And
> don't waste the last segment on each logical 4 GB log file. The concept
> of a "logical log file" is now completely gone. XLogRecPtr is unchanged,
> but it should now be understood as a plain 64-bit value, just split into
> two 32-bit integers for historical reasons. On disk, this means that
> there will be log files ending in FF; those were skipped before.

What's the reason for keeping that awkward split now? There aren't that many users of xlogid/xrecoff, and many of those would be better served by using helper macros. API compatibility isn't a great argument either, as code manually playing around with those needs to be checked anyway. I think there might be some code around that does XLogRecPtr addition manually and such.

> 2. Always include the xl_rem_len field, used for continuation records,
> in the xlog page header. A continuation log record only contained that
> one field; it's now included straight in the page header, so the concept
> of a continuation record doesn't exist anymore. Because of alignment,
> this wastes 4 bytes on every page that contains continued data from a
> previous record, and 8 bytes on pages that don't. That's not very much,
> and the next step will buy that back:
>
> 3. Allow the WAL record header to be split across pages. Per Tom's
> suggestion, move xl_tot_len to be the first field in XLogRecord, so that
> even if the header is split, xl_tot_len is always on the first page.
> xl_crc is moved to be the last field, and xl_prev is the second to last.
> This has the advantage that you can calculate the CRC for all the other fields before acquiring WALInsertLock. For xl_prev, you need to know where exactly the record is inserted, so it's handy that it's the last field before CRC. This patch doesn't try to take advantage of that, however, and I'm not sure if that makes any difference once I finish the patch to make XLogInsert scale better, which is the ultimate goal of all this.
>
> Those are the three patches I'd like to get committed in this commitfest. To see where all this is leading to, I've included a rough WIP version of the XLogInsert scaling patch. This version is quite different from the one I posted in spring, it takes advantage of the WAL format changes, and I'm also experimenting with a different method of tracking how far each WAL insertion has progressed. But more on that later.
>
> (Note to self: remember to bump XLOG_PAGE_MAGIC)

Will review.

Andres
-- 
Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] WAL format changes
On Thursday, June 14, 2012 11:01:42 PM Heikki Linnakangas wrote: > As I threatened earlier > (http://archives.postgresql.org/message-id/4fd0b1ab.3090...@enterprisedb.co > m), here are three patches that change the WAL format. The goal is to > change the format so that when you're inserting a WAL record of a given > size, you know exactly how much space it requires in the WAL. I fear the patches need rebasing after the pgindent run... Even before that (60801944fa105252b48ea5688d47dfc05c695042) it only applies with offsets? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 06/16] Add support for a generic wal reading facility dubbed XLogReader
On Thursday, June 14, 2012 11:19:00 PM Heikki Linnakangas wrote:
> On 13.06.2012 14:28, Andres Freund wrote:
> > Features:
> > - streaming reading/writing
> > - filtering
> > - reassembly of records
> >
> > Reusing the ReadRecord infrastructure in situations where the code that wants to do so is not tightly integrated into xlog.c is rather hard and would require changes to rather integral parts of the recovery code which doesn't seem to be a good idea.
>
> It would be nice to refactor ReadRecord and its subroutines out of xlog.c. That file has grown over the years to be really huge, and separating the code to read WAL sounds like it should be a pretty natural split. I don't want to duplicate all the WAL reading code, so we really should find a way to reuse that. I'd suggest rewriting ReadRecord into a thin wrapper that just calls the new xlogreader code.
>
> > Missing:
> > - "compressing" the stream when removing uninteresting records
> > - writing out correct CRCs
> > - validating CRCs
> > - separating reader/writer
> > - comments.
>
> At a quick glance, I couldn't figure out how this works. There seems to be some callback functions? If you want to read an xlog stream using this facility, what do you do? Can this be used for writing WAL, as well as reading? If so, what do you need the write support for?

Oh, btw, the callbacks and parameters are somewhat documented in the xlogreader.h header in the XLogReaderState struct. Still needs improvement though.

Andres
-- 
Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] measuring spinning
On Wed, Jan 11, 2012 at 8:48 PM, Robert Haas wrote:
> I've had cause, a few times this development cycle, to want to measure the amount of spinning on each lwlock in the system. To that end, I've found the attached patch useful. Note that if you don't define LWLOCK_STATS, this changes nothing except that the return value from s_lock becomes int rather than void. If you do define LWLOCK_STATS, then LWLockAcquire() counts the number of pg_usleep() calls that are required to acquire each LWLock, in addition to the other statistics. Since this has come up for me a few times now, I'd like to propose including it in core.

Well, this fell through the cracks, because I forgot to add it to the January CommitFest. Here it is again, rebased.

-- 
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

spindebug-v2.patch Description: Binary data
Re: [HACKERS] [PATCH 06/16] Add support for a generic wal reading facility dubbed XLogReader
On Thursday, June 14, 2012 11:19:00 PM Heikki Linnakangas wrote:
> On 13.06.2012 14:28, Andres Freund wrote:
> > Features:
> > - streaming reading/writing
> > - filtering
> > - reassembly of records
> >
> > Reusing the ReadRecord infrastructure in situations where the code that wants to do so is not tightly integrated into xlog.c is rather hard and would require changes to rather integral parts of the recovery code which doesn't seem to be a good idea.
>
> It would be nice to refactor ReadRecord and its subroutines out of xlog.c. That file has grown over the years to be really huge, and separating the code to read WAL sounds like it should be a pretty natural split. I don't want to duplicate all the WAL reading code, so we really should find a way to reuse that. I'd suggest rewriting ReadRecord into a thin wrapper that just calls the new xlogreader code.

I agree that it is not very nice to duplicate it. But I also don't want to go the route of replacing ReadRecord with it for a while; we can replace ReadRecord later if we want. As long as it is in flux like it is right now, I don't really see the point in investing energy in it. Also, I am not that sure how a callback-oriented API fits into the xlog.c workflow.

> > Missing:
> > - "compressing" the stream when removing uninteresting records
> > - writing out correct CRCs
> > - validating CRCs
> > - separating reader/writer
> > - comments.
>
> At a quick glance, I couldn't figure out how this works. There seems to be some callback functions? If you want to read an xlog stream using this facility, what do you do?
You currently have to fill out 4 callbacks:

    XLogReaderStateInterestingCB is_record_interesting;
    XLogReaderStateWriteoutCB writeout_data;
    XLogReaderStateFinishedRecordCB finished_record;
    XLogReaderStateReadPageCB read_page;

As an example of how to use it (from the walsender support for START_LOGICAL_REPLICATION):

    if (!xlogreader_state)
    {
        xlogreader_state = XLogReaderAllocate();
        xlogreader_state->is_record_interesting = RecordRelevantForLogicalReplication;
        xlogreader_state->finished_record = ProcessRecord;
        xlogreader_state->writeout_data = WriteoutData;
        xlogreader_state->read_page = XLogReadPage;
        /* startptr is the current XLog position */
        xlogreader_state->startptr = startptr;
        XLogReaderReset(xlogreader_state);
    }

    /* how far does valid data go */
    xlogreader_state->endptr = endptr;

    XLogReaderRead(xlogreader_state);

The last step will then call the above callbacks till it reaches endptr. I.e. it first reads a page with "read_page"; then checks whether a record is interesting for the use-case ("is_record_interesting"); in case it is interesting, it gets reassembled and passed to the "finished_record" callback. Then the bytestream gets written out again with "writeout_data". In this case it gets written to the buffer the walsender has allocated. In others it might just get thrown away.

> Can this be used for writing WAL, as well as reading? If so, what do you need the write support for?

It currently can replace records which are not interesting (e.g. index changes in the case of logical rep). Filtered records are currently replaced with XLOG_NOOP records of the correct length. In future the actual amount of data should really be reduced. I don't yet know how to map LSNs of the uncompressed/compressed stream onto each other...

The filtered data is then passed to a writeout callback (in a streaming fashion). The whole writing-out part is pretty ugly at the moment and I just bolted it on top because it was convenient for the moment.
I am not yet sure how the API for that should look.

Andres
-- 
Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] WIP: relation metapages
On 14.06.2012 18:01, Robert Haas wrote:
> What I'm really looking for at this stage of the game is feedback on the design decisions I made. The intention here is that it should be possible to read old-format heaps and indexes transparently, but that when we create or rewrite a relation, we add a new-style metapage. That dodges the problem of pg_upgrade, but eventually, we will want to use the metapage for something important, and it can't be optional at that point anymore. So we will eventually need to deal with pg_upgrade somehow.
>
> I put the new metapage code in src/backend/access/common/metapage.c, but I don't have a lot of confidence that that's the appropriate location for it. Suggestions are appreciated.

That seems good to me. It would be nice to have the oid of the access method in the metapage (or some other way to identify it).

-- 
Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCH 06/16] Add support for a generic wal reading facility dubbed XLogReader
On 13.06.2012 14:28, Andres Freund wrote:
> Features:
> - streaming reading/writing
> - filtering
> - reassembly of records
>
> Reusing the ReadRecord infrastructure in situations where the code that wants to do so is not tightly integrated into xlog.c is rather hard and would require changes to rather integral parts of the recovery code which doesn't seem to be a good idea.

It would be nice to refactor ReadRecord and its subroutines out of xlog.c. That file has grown over the years to be really huge, and separating the code to read WAL sounds like it should be a pretty natural split. I don't want to duplicate all the WAL reading code, so we really should find a way to reuse that. I'd suggest rewriting ReadRecord into a thin wrapper that just calls the new xlogreader code.

> Missing:
> - "compressing" the stream when removing uninteresting records
> - writing out correct CRCs
> - validating CRCs
> - separating reader/writer
> - comments.

At a quick glance, I couldn't figure out how this works. There seems to be some callback functions? If you want to read an xlog stream using this facility, what do you do? Can this be used for writing WAL, as well as reading? If so, what do you need the write support for?

-- 
Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode
On Thu, Jun 14, 2012 at 5:00 PM, Robert Haas wrote: > On Thu, Jun 14, 2012 at 4:51 PM, Andres Freund wrote: >> Lets sidetrack this till we have a tender agreement on how to handle DDL ;). >> I >> am aware of the issues with rollbacks, truncate et al... > > Agreed; I will write up my thoughts about DDL on the other thread. I > think that's a key thing we need to figure out; once we understand how > we're handling that, the correct design for this will probably fall > out pretty naturally. I wonder if *an* answer (if not forcibly a perfect one) is to provide a capturable injection point. A possible shape of this would be to have a function to which you pass a DDL statement, at which point, it does two things: a) Runs the DDL, to make the requested change, and b) Captures the DDL in a convenient form in the WAL log so that it may be detected and replayed at the right point in processing. That's not a solution for capturing DDL automatically, but it's something that would be useful-ish even today, for systems like Slony and Londiste, and this would be natural to connect to Dimitri's "Event Triggers." That also fits with the desire to have components that are (at least hopefully) usable to other existing replication systems. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode
On Thu, Jun 14, 2012 at 4:51 PM, Andres Freund wrote: > Lets sidetrack this till we have a tender agreement on how to handle DDL ;). I > am aware of the issues with rollbacks, truncate et al... Agreed; I will write up my thoughts about DDL on the other thread. I think that's a key thing we need to figure out; once we understand how we're handling that, the correct design for this will probably fall out pretty naturally. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode
On Thursday, June 14, 2012 08:50:51 PM Robert Haas wrote:
> On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund wrote:
> > This patch is problematic because formally indexes used by syscaches need to be unique; this one is not though, because of 0/InvalidOid entries for nailed/shared catalog entries. Those values aren't allowed to be queried though.
>
> That's not the only reason it's not unique. Take a look at GetRelFileNode(). We really only guarantee that <database OID, tablespace OID, relfilenode, backend-ID>, taken as a four-tuple, is unique. You could have the same relfilenode in different tablespaces, or even within the same tablespace with different backend-IDs. The latter might not matter for you because you're presumably disregarding temp tables, but the former probably does. It's an uncommon scenario because we normally set relid = relfilenode, and of course relid is unique across the database, but if the table gets rewritten then you end up with relid != relfilenode, and I don't think there's anything at that point that will prevent the new relfilenode from being chosen as some other relation's relfilenode, as long as it's in a different tablespace.
>
> I think the solution may be to create a specialized cache for this, rather than relying on the general syscache infrastructure. You might look at, e.g., attoptcache.c for an example. That would allow you to build a cache that is aware of things like the relmapper infrastructure, and the fact that temp tables are ignorable for your purposes. But I think you will need to include at least the tablespace OID in the key along with the relfilenode to make it bullet-proof.

Yes, the tablespace OID should definitely be in there. I need to read up on the details of building our own cache. Once more, I didn't want to put in more work before discussing it here.
> I haven't read through the patch series far enough to know what this is being used for yet, but my fear is that you're using it to handle mapping a relfilenode extracted from the WAL stream back to a relation OID. The problem with that is that relfilenode assignments obey transaction semantics. So, if someone begins a transaction, truncates a table, inserts a tuple, and commits, the heap_insert record is going to refer to a relfilenode that, according to the system catalogs, doesn't exist. This is similar to one of the worries in my other email, so I won't belabor the point too much more here...

Well, yes. We *need* to do the mapping back from the relfilenode to a table. The idea is that because the receiving side, be it a full cluster or just a catalog-only one, has a fully synchronized catalog in which DDL gets applied correctly, inside the transaction just as on the sending side, it should never be wrong to do that mapping. It probably is necessary to make the syscache lookup/infrastructure use an MVCCish snapshot though. No idea how hard that would be yet. Might be a good argument for your suggestion of using a specialized cache.

Lets sidetrack this till we have a tender agreement on how to handle DDL ;). I am aware of the issues with rollbacks, truncate et al...

Thanks,

Andres
-- 
Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC][PATCH] Logical Replication/BDR prototype and architecture
Hi Robert,

Thanks for your answer.

On Thursday, June 14, 2012 06:17:26 PM Robert Haas wrote:
> On Wed, Jun 13, 2012 at 7:27 AM, Andres Freund wrote:
> > === Design goals for logical replication ===
> > - in core
> > - fast
> > - async
> > - robust
> > - multi-master
> > - modular
> > - as unintrusive as possible implementation wise
> > - basis for other technologies (sharding, replication into other DBMSs, ...)
>
> I agree with all of these goals except for "multi-master". I am not sure that there is a need to have a multi-master replication solution in core. The big tricky part of multi-master replication is conflict resolution, and that seems like an awful lot of logic to try to build into core - especially given that we will want it to be extensible.

I don't plan to throw in loads of conflict resolution smarts. The aim is to get to the place where all the infrastructure is there so that a MM solution can be built by basically plugging in a conflict resolution mechanism. Maybe providing a very simple one. I think without in-core support it's really, really hard to build a sensible MM implementation. Which doesn't mean it has to live entirely in core.

Loads of the use-cases we have seen lately have a relatively small, low-conflict shared dataset and a far bigger sharded one. While that obviously isn't the only relevant use case, it is a sensible and important one.

> More generally, I would much rather see us focus on efficiently extracting changesets from WAL and efficiently applying those changesets on another server. IMHO, those are the things that are holding back the not-in-core replication solutions we have today, particularly the first. If we come up with a better way of applying change-sets, well, Slony can implement that too; they are already installing C code. What neither they nor any other non-core solution can implement is change-set extraction, and therefore I think that ought to be the focus.
It definitely is a very important focus. I don't think it is the only one, but that doesn't seem to be a problem to me as long as everything is kept fairly modular (which I tried rather hard to do).

> To put all that another way, I think it is a 100% bad idea to try to kill off Slony, Bucardo, Londiste, or any of the many home-grown solutions that are out there to do replication. Even if there were no technical place for third-party replication products (and I think there is), we will not win many friends by making it harder to extend and add on to the server. If we build an in-core replication solution that can be used all by itself, that is fine with me. But I think it should also be able to expose its constituent parts as a toolkit for third-party solutions.

I agree 100%. Unfortunately I forgot to make that point explicitly, but the plan is definitely to make the life of other replication solutions easier, not harder. I don't think there will ever be one replication solution that fits every use-case perfectly.

At pgcon I talked with some of the Slony guys and they were definitely interested in the changeset generation, and I have kept that in mind. If some problems that need resolving independently of this issue get resolved (namely DDL), it shouldn't take much to generate their output format. The 'apply' code is fully abstracted and separated.
> > While you may argue that most of the above design goals are already provided by various trigger based replication solutions like Londiste or Slony, we think that that's not enough for various reasons:
> >
> > - not in core (and thus less trustworthy)
> > - duplication of writes due to an additional log
> > - performance in general (check the end of the above presentation)
> > - complex to use because there is no native administration interface
>
> I think that your parenthetical note "(and thus less trustworthy)" gets at another very important point, which is that one of the standards for inclusion in core is that it must in fact be trustworthy enough to justify the confidence that users will place in it. It will NOT benefit the project to have two replication solutions in core, one of which is crappy. More, even if what we put in core is AS GOOD as the best third-party solutions that are available, I don't think that's adequate. It has to be better. If it isn't, there is no excuse for preempting what's already out there.

I agree that it has to be very good. *But* I think it is totally acceptable if it doesn't have all the bells and whistles from the start. That would be a sure road to disaster. For one, implementing all that takes time, and for another, the amount of discussion until we are there is rather huge.

> I imagine you are thinking along similar lines, but I think that it bears being explicit about.

Seems like we're thinking along the same lines, yes.

> > The biggest problem is, that interpreting tuples in the WAL stream requires an up-to-date system
Re: [HACKERS] libpq compression
On Thu, Jun 14, 2012 at 02:38:02PM -0500, Merlin Moncure wrote:
> On Thu, Jun 14, 2012 at 1:43 PM, Tom Lane wrote:
> > So I've got very little patience with the idea of "let's put in some hooks and then great things will happen". It would be far better all around if we supported exactly one, well-chosen, method. But really I still don't see a reason not to let openssl do it for us.
>
> Well, for toast compression the right choice is definitely one of the lz-based algorithms (not libz!). For transport compression you have the case of sending large data over very slow and/or expensive links, in which case you want to use bzip-type methods. But in the majority of cases I'd probably be using lz there too. So if I had to pick just one, there you go. But which one? The lz algorithm with arguably the best pedigree (lzo) is GPL-licensed, but there are many other decent candidates, some of which have really tiny implementations.
>
> merlin

+1 for a very fast compressor/de-compressor. lz4 has a BSD license and, at 8.5X faster compression than zlib(-1), 5X faster decompression than zlib(-1), and 2X faster than even LZO, it would be my pick.

Regards,
Ken
Re: [HACKERS] libpq compression
On Thu, Jun 14, 2012 at 1:43 PM, Tom Lane wrote:
> So I've got very little patience with the idea of "let's put in some hooks and then great things will happen". It would be far better all around if we supported exactly one, well-chosen, method. But really I still don't see a reason not to let openssl do it for us.

Well, for toast compression the right choice is definitely one of the lz-based algorithms (not libz!). For transport compression you have the case of sending large data over very slow and/or expensive links, in which case you want to use bzip-type methods. But in the majority of cases I'd probably be using lz there too. So if I had to pick just one, there you go. But which one? The lz algorithm with arguably the best pedigree (lzo) is GPL-licensed, but there are many other decent candidates, some of which have really tiny implementations.

merlin
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 3:20 PM, Robert Haas wrote:
> Also, you had the same primary content for both comments, and you incorrectly reversed the sense of the FreeFile() test.

I am known for my fat fingers :)

> Committed with those changes.

Thanks!

-- 
Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] sortsupport for text
On Thu, Jun 14, 2012 at 3:24 PM, Peter Geoghegan wrote:
> On 14 June 2012 19:28, Robert Haas wrote:
>> I thought that doubling repeatedly would be overly aggressive in terms of memory usage. Blowing the buffers out to 8kB because we hit a string that's a bit over 4kB isn't so bad, but blowing them out to 256MB because we hit a string that's a bit over 128MB seems a bit excessive.
>
> That's pretty much what all popular dynamic array implementations do, from C++'s std::vector to Python's list (it's a misnomer). Having to allocate 256MB for a buffer to contain a string a bit over 128MB may seem excessive, until you later get a 250MB string. Even if doubling is generally excessive, which I doubt, that's beside the point, which is that expanding the array by some constant proportion results in each insertion taking amortized constant time.

Yeah, but *it doesn't matter*. If you test this on strings that are long enough that they get pushed out to TOAST, you'll find that it doesn't measurably improve performance, because the overhead of detoasting so completely dominates any savings on the palloc side that you can't pick them out of the inter-run noise. Risking eating up an extra 100MB of memory that we don't really need in order to obtain a performance optimization that is far too small to measure does not make sense.

The case with std::vector is not analogous; they don't have any way of knowing what other overhead you are incurring between insertions into the vector, so it's reasonable to suppose that the cost of the vector insertions themselves might be material. Here we know that it doesn't matter, so the application of Knuth's first law of optimization is appropriate.

>> Suppose we expand the buffer a kilobyte at a time from an initial size of 1kB all the way out to 256MB. That's 256,000 palloc calls, so we must be sorting at least 256,000 datums, at least 128,000 of which are longer than 128MB.
>> I think the cost of calling memcpy() and strcoll() repeatedly on all those long datums - not to mention repeatedly detoasting them - is going to bludgeon the palloc overhead into complete insignificance.
>
> I fail to understand how this sortsupport buffer fundamentally differs from a generic dynamic array abstraction built to contain chars. That being the case, I see no reason not to just do what everyone else does when expanding dynamic arrays, and no reason why we shouldn't make essentially the same time-space trade-off here as others do elsewhere.
>
> Another concern is that it seems fairly pointless to have two buffers. Wouldn't it be more sensible to have a single buffer that was partitioned to make two logical, equally-sized buffers, given that in general each buffer is expected to grow at exactly the same rate?

Sure, but it would be making the code more complicated in return for no measurable performance benefit. We generally avoid that.

-- 
Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] sortsupport for text
On 14 June 2012 19:28, Robert Haas wrote: > I thought that doubling repeatedly would be overly aggressive in terms > of memory usage. Blowing the buffers out to 8kB because we hit a > string that's a bit over 4kB isn't so bad, but blowing them out to > 256MB because we hit a string that's a bit over 128MB seems a bit > excessive. That's pretty much what all popular dynamic array implementations do, from C++'s std::vector to Python's list (it's a misnomer). Having to allocate 256MB for a buffer to contain a string a bit over 128MB may seem excessive, until you later get a 250MB string. Even if doubling is generally excessive, which I doubt, that's beside the point, which is that expanding the array by some constant proportion results in each insertion taking amortized constant time. > Suppose we expand the buffer a > kilobyte at a time from an initial size of 1kB all the way out to > 256MB. That's 256,000 palloc calls, so we must be sorting at least > 256,000 datums, at least 128,000 of which are longer than 128MB. I > think the cost of calling memcpy() and strcoll() repeatedly on all > those long datums - not to mention repeatedly detoasting them - is > going to bludgeon the palloc overhead into complete insignificance. I fail to understand how this sortsupport buffer fundamentally differs from a generic dynamic array abstraction built to contain chars. That being the case, I see no reason not to just do what everyone else does when expanding dynamic arrays, and no reason why we shouldn't make essentially the same time-space trade-off here as others do elsewhere. Another concern is that it seems fairly pointless to have two buffers. Wouldn't it be more sensible to have a single buffer that was partitioned to make two logical, equally-sized buffers, given that in general each buffer is expected to grow at exactly the same rate? 
-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 3:10 PM, Gurjeet Singh wrote: >> Well, according to the comments for AllocateFile: >> >> * fd.c will automatically close all files opened with AllocateFile at >> * transaction commit or abort; this prevents FD leakage if a routine >> * that calls AllocateFile is terminated prematurely by ereport(ERROR). > > I bet anyone else looking at this code, who is not in the know, will trip > over this again. > > Another problem with that code block is that it will throw "could not read" > even though read has succeeded but FreeFile() failed. > > I say we break it into two blocks, one to handle read error, and then close > the file separately. Also, either make sure FreeFile() is called in all code > paths, or not call FreeFile() at all and reference to the comment above > AllocateFile(). > > Patch attached. I agree with breaking it into two blocks, but I don't agree that the comments need to recapitulate how AllocateFile works. Also, you had the same primary content for both comments, and you incorrectly reversed the sense of the FreeFile() test. Committed with those changes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COMMENT on function's arguments
On Tue, Jun 12, 2012 at 10:59 PM, Vlad Arkhipov wrote: > Does it make sense to have a comment on a function's arguments? Of course it > is possible to include these comments in a function's comment, but it may be > better to have them in a more formalized way, like comments on columns of a > table. IDEs may use this information when providing hints for a function, > like in other languages. This would be somewhat tricky, because our COMMENT support assumes that the object upon which we're commenting has an ObjectAddress, and individual arguments to a function don't, although perhaps the sub-object-id stuff that we currently use to handle comments on table columns could be extended to handle this case. I guess I wouldn't object to a well-done patch that made this work, but creating such a patch seems likely to be tricky, owing to the fact that there's nothing in the system that thinks of the individual arguments to a function as separate objects at present. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 2:02 PM, Robert Haas wrote: > On Thu, Jun 14, 2012 at 1:48 PM, Gurjeet Singh > wrote: > > On Thu, Jun 14, 2012 at 1:29 PM, Robert Haas > wrote: > >> Committed. > > > > > > A minor gripe: > > > > + /* > > + * Close the backup label file. > > + */ > > + if (ferror(lfp) || FreeFile(lfp)) { > > +ereport(ERROR, > > +(errcode_for_file_access(), > > +errmsg("could not read file \"%s\": %m", > > + BACKUP_LABEL_FILE))); > > + } > > + > > > > If ferror(lfp) returns false, wouldn't we miss the FreeFile() and leak a > > file pointer? > > Well, according to the comments for AllocateFile: > > * fd.c will automatically close all files opened with AllocateFile at > * transaction commit or abort; this prevents FD leakage if a routine > * that calls AllocateFile is terminated prematurely by ereport(ERROR). > I bet anyone else looking at this code, who is not in the know, will trip over this again. Another problem with that code block is that it will throw "could not read" even though the read has succeeded but FreeFile() failed. I say we break it into two blocks, one to handle read errors, and then close the file separately. Also, either make sure FreeFile() is called in all code paths, or not call FreeFile() at all and refer to the comment above AllocateFile(). Patch attached. Regards, -- Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company FreeFile.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allow WAL information to recover corrupted pg_controldata
On Thu, Jun 14, 2012 at 11:39 AM, Amit Kapila wrote: > I am planning to work on the below Todo list item for this CommitFest > Allow WAL information to recover corrupted pg_controldata > http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php The deadline for patches for this CommitFest is today, so I think you should target any work you're starting now for the NEXT CommitFest. > I wanted to confirm my understanding about the work involved for this patch: > The existing patch has the following set of problems: > 1. Memory leak and linked list code path is not proper > 2. lock check for if the server is already running, is removed in patch > which needs to be reverted > 3. Refactoring of the code. > > Apart from the above, what I understood from the patch is that its intention is > to generate values for ControlFile using WAL logs when the -r option is used. > > The change in algorithm from the current one will be: if the control file is corrupt, > which essentially means ReadControlFile() will return False, then it should > generate values (checkPointCopy, checkPoint, prevCheckPoint, state) using > WAL if the -r option is enabled. > > Also for the -r option, it doesn't need to call the function FindEndOfXLOG(), as > that work will be achieved by the above point. > > It will just rewrite the control file and not do other resets. > > > The algorithm for restoring the pg_control values from old xlog files: > 1. Retrieve all of the active xlog files from the xlog directory into a list > in increasing order, according to their timeline, log id, segment id. > 2. Search the list to find the oldest xlog file of the latest time line. > 3. Search the records from the oldest xlog file of the latest time line to > the latest xlog file of the latest time line; if a checkpoint record > has been found, update the latest checkpoint and previous checkpoint. > > > > Apart from the above, some changes in code will be required after the Xlog patch > by Heikki. > > Please let me know if my understanding is correct.
I guess my first question is: why do we need this? There are lots of things in the TODO list that someone wanted once upon a time, but they're not all actually important. Do you have reason to believe that this one is? It's been six years since that email, so it's worth asking if this is actually relevant. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 03/16] Add a new syscache to fetch a pg_class entry via its relfilenode
On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund wrote: > This patch is problematic because formally indexes used by syscaches need to > be unique, this one is not though because of 0/InvalidOids entries for > nailed/shared catalog entries. Those values aren't allowed to be queried > though. That's not the only reason it's not unique. Take a look at GetRelFileNode(). We really only guarantee that (database, tablespace, relfilenode, backend-ID), taken as a four-tuple, is unique. You could have the same relfilenode in different tablespaces, or even within the same tablespace with different backend-IDs. The latter might not matter for you because you're presumably disregarding temp tables, but the former probably does. It's an uncommon scenario because we normally set relid = relfilenode, and of course relid is unique across the database, but if the table gets rewritten then you end up with relid != relfilenode, and I don't think there's anything at that point that will prevent the new relfilenode from being chosen as some other relation's relfilenode, as long as it's in a different tablespace. I think the solution may be to create a specialized cache for this, rather than relying on the general syscache infrastructure. You might look at, e.g., attoptcache.c for an example. That would allow you to build a cache that is aware of things like the relmapper infrastructure, and the fact that temp tables are ignorable for your purposes. But I think you will need to include at least the tablespace OID in the key along with the relfilenode to make it bullet-proof. I haven't read through the patch series far enough to know what this is being used for yet, but my fear is that you're using it to handle mapping a relfilenode extracted from the WAL stream back to a relation OID. The problem with that is that relfilenode assignments obey transaction semantics.
So, if someone begins a transaction, truncates a table, inserts a tuple, and commits, the heap_insert record is going to refer to a relfilenode that, according to the system catalogs, doesn't exist. This is similar to one of the worries in my other email, so I won't belabor the point too much more here... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
Euler Taveira writes: > On 14-06-2012 02:19, Tom Lane wrote: >> ... I especially think that importing bzip2 >> is a pretty bad idea --- it's not only a new dependency, but bzip2's >> compression versus speed tradeoff is entirely inappropriate for this >> use-case. > I see, the idea is that bzip2 would be a compiled-in option (not enabled by > default) just to give another compression option. I'm not particularly thrilled with "let's have more compression options just to have options". Each such option you add is another source of fail-to-connect incompatibilities (when either the client or the server doesn't have it). Moreover, while there are a lot of compression algorithms out there, a lot of them are completely unsuited for this use-case. If memory serves, bzip2 for example requires fairly large data blocks in order to get decent compression, which means you are either going to get terrible compression or suffer very bad latency when trying to apply it to a connection data stream. So I've got very little patience with the idea of "let's put in some hooks and then great things will happen". It would be far better all around if we supported exactly one, well-chosen, method. But really I still don't see a reason not to let openssl do it for us. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 2:05 PM, Fujii Masao wrote: > On Fri, Jun 15, 2012 at 2:29 AM, Robert Haas wrote: >> On Thu, Jun 14, 2012 at 6:06 AM, Gabriele Bartolini >> wrote: >>> thank you very much for your patience (and thank you Marco for supporting >>> me). I apologise for the delay. >>> >>> I have retested the updated patch and it works fine with me. It is "ready >>> for committer" for me. >> >> Committed. > > I think the attached patch needs to be applied. Thanks, committed. (Somebody should really give you your own commit bit.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On Thu, Jun 14, 2012 at 2:10 PM, Peter Geoghegan wrote: > Why have you made the reusable buffer managed by sortsupport > TEXTBUFLEN-aligned? The existing rationale for that constant (whose > value is 1024) does not seem to carry forward here: > > * This should be large enough that most strings will be fit, but small > * enough that we feel comfortable putting it on the stack. Well, as the comments explain:

+ /*
+  * We avoid repeated palloc/pfree on long strings by keeping the buffers
+  * we allocate around for the duration of the sort.  When we expand them,
+  * we round off to the next multiple of TEXTBUFLEN in order to avoid
+  * repeatedly expanding them by very small amounts.
+  */

> ISTM it would be on average worth the hit of having to repalloc a few > more times for larger strings by making that buffer much smaller > initially, and doubling its size each time that proved insufficient, > rather than increasing its size to the smallest possible > TEXTBUFLEN-aligned size that you can get away with for the immediately > subsequent memcpy. Realistically, any database I've ever worked with > would probably be able to fit a large majority of its text strings > into 16 chars of memory - you yourself said that sorting toasted text > isn't at all common. I thought that doubling repeatedly would be overly aggressive in terms of memory usage. Blowing the buffers out to 8kB because we hit a string that's a bit over 4kB isn't so bad, but blowing them out to 256MB because we hit a string that's a bit over 128MB seems a bit excessive. Also, I don't think it really saves anything from a performance point of view. The worst case for this is if we're asked to repeatedly compare strings that expand the buffer by a kilobyte each time. First, how likely is that to happen in a real world test case?
In many cases, most of the input strings will be of approximately equal length; also in many cases, that length will be short; even if the lengths take the worst possible distribution, we have to hit them in the worst possible order for this to be a problem. Second, even if it does happen, does it really matter? Suppose we expand the buffer a kilobyte at a time from an initial size of 1kB all the way out to 256MB. That's 256,000 palloc calls, so we must be sorting at least 256,000 datums, at least 128,000 of which are longer than 128MB. I think the cost of calling memcpy() and strcoll() repeatedly on all those long datums - not to mention repeatedly detoasting them - is going to bludgeon the palloc overhead into complete insignificance. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On Thu, Jun 14, 2012 at 1:56 PM, Peter Geoghegan wrote: > On 14 June 2012 17:35, Robert Haas wrote: >> The problem with pre-detoasting to save comparison cycles is that you >> can now fit many, many fewer tuples in work_mem. There might be cases >> where it wins (for example, because the entire data set fits even >> after decompressing everything) but in most cases it seems like a >> loser. > > If I had to guess, I'd say you're probably right about that - > optimising sorting toasted text doesn't seem like a terribly sensible > use of your time. > > What about the strxfrm suggestion of Greg's? You might find that the > added benefit of being able to avail of a highly optimised strcmp() > tipped the balance in favour of that idea, beyond the simple fact that > there's only a linear number of what you might loosely call "strcoll_l > units of work" rather than as many as O(n ^ 2). Furthermore, I'd > speculate that if you were to interlace the strxfrm() calls with > copying each text string, somewhere like within a specialised > datumCopy(), that would make the approach more efficient still, as you > specify a location for the blob in the just-palloc()'d leading-key > private memory directly, rather than just using memcpy. Well, it's still got the problem of blowing up memory usage. I just can't get excited about optimizing for the case where we can consume 10x the memory and still fit in work_mem. If we've got that case, the sort is gonna be pretty fast anyway. The case where preprocessing wins is when there are going to be a large number of comparisons against each tuple - i.e. lg(N) is large. But the cases where we could pre-transform with strxfrm are those where the data fits in a small percentage of work mem - i.e. lg(N) is small. I'm open to somebody showing up with a test result that demonstrates that it's worthwhile, but to me it seems like it's chasing diminishing returns. 
The point of this patch isn't really to improve things for the collation-aware case, although it's nice that it does. The point is rather to shave off a double-digit percentage off the time it takes to do the sort required to build a C-collation index, which is what people should be using when they don't care about < and >, which most don't. Despite Tom's concerns, I don't think there's anything in this patch that can't be fairly easily revised at a later date if we decide we want a different API. I think it's worth picking the low-hanging fruit in the meantime. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On 2 March 2012 20:45, Robert Haas wrote: > I decided to investigate the possible virtues of allowing "text" to > use the sortsupport infrastructure, since strings are something people > often want to sort. I should mention up-front that I agree with the idea that it is worth optimising text sorting because it is a very common thing to have to do, and therefore the standard for inclusion ought to be lower. I don't intend to talk about tapesort though - that isn't really fair, not least because I have some serious doubts about the quality of our implementation. Furthermore, I think that it is logical that doing things like resolving collations occur within a preparatory function in advance of sorting, rather than redundantly doing that for each and every comparison. Why have you made the reusable buffer managed by sortsupport TEXTBUFLEN-aligned? The existing rationale for that constant (whose value is 1024) does not seem to carry forward here: * This should be large enough that most strings will be fit, but small * enough that we feel comfortable putting it on the stack. ISTM it would be on average worth the hit of having to repalloc a few more times for larger strings by making that buffer much smaller initially, and doubling its size each time that proved insufficient, rather than increasing its size to the smallest possible TEXTBUFLEN-aligned size that you can get away with for the immediately subsequent memcpy. Realistically, any database I've ever worked with would probably be able to fit a large majority of its text strings into 16 chars of memory - you yourself said that sorting toasted text isn't at all common. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Fri, Jun 15, 2012 at 2:29 AM, Robert Haas wrote: > On Thu, Jun 14, 2012 at 6:06 AM, Gabriele Bartolini > wrote: >> thank you very much for your patience (and thank you Marco for supporting >> me). I apologise for the delay. >> >> I have retested the updated patch and it works fine with me. It is "ready >> for committer" for me. > > Committed. I think the attached patch needs to be applied. Regards, -- Fujii Masao doc_v1.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 1:48 PM, Gurjeet Singh wrote: > On Thu, Jun 14, 2012 at 1:29 PM, Robert Haas wrote: >> >> On Thu, Jun 14, 2012 at 6:06 AM, Gabriele Bartolini >> wrote: >> > thank you very much for your patience (and thank you Marco for >> > supporting >> > me). I apologise for the delay. >> > >> > I have retested the updated patch and it works fine with me. It is >> > "ready >> > for committer" for me. >> >> Committed. > > > A minor gripe: > > + /* > + * Close the backup label file. > + */ > + if (ferror(lfp) || FreeFile(lfp)) { > + ereport(ERROR, > + (errcode_for_file_access(), > + errmsg("could not read file \"%s\": %m", > + BACKUP_LABEL_FILE))); > + } > + > > If ferror(lfp) returns false, wouldn't we miss the FreeFile() and leak a > file pointer? Well, according to the comments for AllocateFile: * fd.c will automatically close all files opened with AllocateFile at * transaction commit or abort; this prevents FD leakage if a routine * that calls AllocateFile is terminated prematurely by ereport(ERROR). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On 14 June 2012 17:35, Robert Haas wrote: > The problem with pre-detoasting to save comparison cycles is that you > can now fit many, many fewer tuples in work_mem. There might be cases > where it wins (for example, because the entire data set fits even > after decompressing everything) but in most cases it seems like a > loser. If I had to guess, I'd say you're probably right about that - optimising sorting toasted text doesn't seem like a terribly sensible use of your time. What about the strxfrm suggestion of Greg's? You might find that the added benefit of being able to avail of a highly optimised strcmp() tipped the balance in favour of that idea, beyond the simple fact that there's only a linear number of what you might loosely call "strcoll_l units of work" rather than as many as O(n ^ 2). Furthermore, I'd speculate that if you were to interlace the strxfrm() calls with copying each text string, somewhere like within a specialised datumCopy(), that would make the approach more efficient still, as you specify a location for the blob in the just-palloc()'d leading-key private memory directly, rather than just using memcpy. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 1:29 PM, Robert Haas wrote: > On Thu, Jun 14, 2012 at 6:06 AM, Gabriele Bartolini > wrote: > > thank you very much for your patience (and thank you Marco for > supporting > > me). I apologise for the delay. > > > > I have retested the updated patch and it works fine with me. It is > "ready > > for committer" for me. > > Committed. A minor gripe:

+ /*
+  * Close the backup label file.
+  */
+ if (ferror(lfp) || FreeFile(lfp)) {
+     ereport(ERROR,
+             (errcode_for_file_access(),
+              errmsg("could not read file \"%s\": %m",
+                     BACKUP_LABEL_FILE)));
+ }
+

If ferror(lfp) returns false, wouldn't we miss the FreeFile() and leak a file pointer? Regards, -- Gurjeet Singh EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] psql tab completion for GRANT role
On Sun, Jan 8, 2012 at 3:48 PM, Peter Eisentraut wrote: > psql tab completion currently only supports the form GRANT privilege ON > something TO someone (and the analogous REVOKE), but not the form GRANT > role TO someone. Here is a patch that attempts to implement the latter. This seems to have fallen through the cracks. It doesn't apply any more, but one general comment is that it seems undesirable to repeatedly recapitulate the list of all privileges that exist in the system. That's a lot of places that someone will have to find and fix when new privileges are added. But +1 on the general idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Namespace of array of user defined types is confused by the parser in insert?
On Mon, Apr 23, 2012 at 9:42 AM, Krzysztof Nienartowicz wrote: > Hello, > Sorry for re-posting - I initially posted this in pgsql.sql - probably > this group is more appropriate. > > I have a bizarre problem that started to manifest itself after > the addition of a field that is an array of compound UDTs to a table > declared in multiple schemas. > It is clearly related to how the type namespace is resolved, and shows > up for the JDBC client (probably related to the parameterized query, as > the static query works without problems). I'm replying to this awfully late, but I'm guessing this is some kind of JDBC magic, not anything that PostgreSQL is causing directly. You might want to post to pgsql-jdbc, if you haven't already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch pg_is_in_backup()
On Thu, Jun 14, 2012 at 6:06 AM, Gabriele Bartolini wrote: > thank you very much for your patience (and thank you Marco for supporting > me). I apologise for the delay. > > I have retested the updated patch and it works fine with me. It is "ready > for committer" for me. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minimising windows installer password confusion
On Thu, Jun 14, 2012 at 5:38 PM, Robert Haas wrote: > On Thu, Jun 14, 2012 at 11:59 AM, Dave Page wrote: >> On Thu, Jun 14, 2012 at 11:43 AM, Dave Page wrote: >>> >>> I'll have a play with it and see if a simple switch to NetworkService >>> seems feasible. >> >> OK, I worked up a patch which uses "NT AUTHORITY\NetworkService" as >> the service account by default. This doesn't need a password, so >> allows us to simply prompt during installation for the superuser >> password for the cluster, and not at all during upgrade. If you run >> the installer from the command line with "--serviceaccount postgres" >> (or some other account name), you get the current behaviour. >> >> I've posted it on our internal ReviewBoard system for the rest of the >> team to review and test on various platforms (I've only tried it on XP >> so far). Now would be a very good time for people to yelp if they >> think this is a bad idea (the only downside I can see is accessing >> files for server-side copy etc., but users that want to do that can >> install with a non-default account). If we go ahead, we'll include it >> in 9.2. > > What happens if they try to use this to upgrade from the EDB 9.1 installers? The installers don't support major version upgrades - it'll install alongside 9.1. If someone has an existing beta installation of 9.2, it'll use the service account it was installed with, just as past minor-version upgrades would (including asking for the password). -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minimising windows installer password confusion
On Thu, Jun 14, 2012 at 11:59 AM, Dave Page wrote: > On Thu, Jun 14, 2012 at 11:43 AM, Dave Page wrote: >> >> I'll have a play with it and see if a simple switch to NetworkService >> seems feasible. > > OK, I worked up a patch which uses "NT AUTHORITY\NetworkService" as > the service account by default. This doesn't need a password, so > allows us to simply prompt during installation for the superuser > password for the cluster, and not at all during upgrade. If you run > the installer from the command line with "--serviceaccount postgres" > (or some other account name), you get the current behaviour. > > I've posted it on our internal ReviewBoard system for the rest of the > team to review and test on various platforms (I've only tried it on XP > so far). Now would be a very good time for people to yelp if they > think this is a bad idea (the only downside I can see is accessing > files for server-side copy etc., but users that want to do that can > install with a non-default account). If we go ahead, we'll include it > in 9.2. What happens if they try to use this to upgrade from the EDB 9.1 installers? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sortsupport for text
On Thu, Jun 14, 2012 at 11:36 AM, Peter Geoghegan wrote: > On 18 March 2012 15:08, Tom Lane wrote: >> One other thing I've always wondered about in this connection is the >> general performance of sorting toasted datums. Is it better to detoast >> them in every comparison, or pre-detoast to save comparison cycles at >> the cost of having to push much more data around? I didn't see any >> discussion of this point in Robert's benchmarks, but I don't think we >> should go very far towards enabling sortsupport for text until we >> understand the issue and know whether we need to add more infrastructure >> for it. If you cross your eyes a little bit, this is very much like >> the strxfrm question... > > I see the parallels. The problem with pre-detoasting to save comparison cycles is that you can now fit many, many fewer tuples in work_mem. There might be cases where it wins (for example, because the entire data set fits even after decompressing everything) but in most cases it seems like a loser. Also, my guess is that most values people sort by are pretty short, making this concern mostly academic. Suppose you are sorting a bunch of strings which might be either 100 characters in length or 1MB. If they're all 100 characters, you probably don't need to detoast. If they're all 1MB, you probably can't detoast without eating up a ton of memory (and even if you have it, this might not be the best use for it). If you have a mix, detoasting might be affordable provided that the percentage of long strings is small, but it's also not going to save you much, because if the percentage of long strings is small, then most comparisons will be between two short strings where we don't save anything anyway. All things considered, this seems to me to be aiming at a pretty narrow target, but maybe I'm just not thinking about it creatively enough. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC][PATCH] Logical Replication/BDR prototype and architecture
On Wed, Jun 13, 2012 at 7:27 AM, Andres Freund wrote: > === Design goals for logical replication === : > - in core > - fast > - async > - robust > - multi-master > - modular > - as unintrusive as possible implementation wise > - basis for other technologies (sharding, replication into other DBMSs, ...) I agree with all of these goals except for "multi-master". I am not sure that there is a need to have a multi-master replication solution in core. The big tricky part of multi-master replication is conflict resolution, and that seems like an awful lot of logic to try to build into core - especially given that we will want it to be extensible. More generally, I would much rather see us focus on efficiently extracting changesets from WAL and efficiently applying those changesets on another server. IMHO, those are the things that are holding back the not-in-core replication solutions we have today, particularly the first. If we come up with a better way of applying change-sets, well, Slony can implement that too; they are already installing C code. What neither they nor any other non-core solution can implement is change-set extraction, and therefore I think that ought to be the focus. To put all that another way, I think it is a 100% bad idea to try to kill off Slony, Bucardo, Londiste, or any of the many home-grown solutions that are out there to do replication. Even if there were no technical place for third-party replication products (and I think there is), we will not win many friends by making it harder to extend and add on to the server. If we build an in-core replication solution that can be used all by itself, that is fine with me. But I think it should also be able to expose its constituent parts as a toolkit for third-party solutions. 
> While you may argue that most of the above design goals are already provided > by > various trigger-based replication solutions like Londiste or Slony, we think > that that's not enough for various reasons: > > - not in core (and thus less trustworthy) > - duplication of writes due to an additional log > - performance in general (check the end of the above presentation) > - complex to use because there is no native administration interface I think that your parenthetical note "(and thus less trustworthy)" gets at another very important point, which is that one of the standards for inclusion in core is that it must in fact be trustworthy enough to justify the confidence that users will place in it. It will NOT benefit the project to have two replication solutions in core, one of which is crappy. More, even if what we put in core is AS GOOD as the best third-party solutions that are available, I don't think that's adequate. It has to be better. If it isn't, there is no excuse for preempting what's already out there. I imagine you are thinking along similar lines, but I think that it bears being explicit about. > As we need a change stream that contains all required changes in the correct > order, the requirement for this stream to reflect changes across multiple > concurrent backends raises concurrency and scalability issues. Reusing the > WAL stream for this seems a good choice since it is needed anyway and addresses > those issues already, and it further means that we don't incur duplicate > writes. Any other stream-generating component would introduce additional > scalability issues. Agreed. > We need a change stream that contains all required changes in the correct > order which thus needs to be synchronized across concurrent backends which > introduces > obvious concurrency/scalability issues.
> Reusing the WAL stream for this seems a good choice since it is needed anyway > and addresses those issues already, and it further means we don't duplicate the > writes and locks already performed for its maintenance. Agreed. > Unfortunately, in this case, the WAL is mostly a physical representation of > the > changes and thus does not, by itself, contain the necessary information in a > convenient format to create logical changesets. Agreed. > The biggest problem is that interpreting tuples in the WAL stream requires an > up-to-date system catalog and needs to be done in a compatible backend and > architecture. The requirement of an up-to-date catalog could be solved by > adding more data to the WAL stream but it seems likely that that would > require relatively intrusive & complex changes. Instead we chose to require a > synchronized catalog at the decoding site. That adds some complexity to use > cases like replicating into a different database or cross-version > replication. For those it is relatively straightforward to develop a proxy pg > instance that only contains the catalog and does the transformation to textual > changes. The actual requirement here is more complex than "an up-to-date catalog". Suppose transaction X begins, adds a column to a table, inserts a row, and commits. That tuple needs to be interpreted using th
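The catalog-versioning problem described above can be modeled in a few lines of Python. This is a toy sketch with invented names, not the actual decoding code: a tuple written by transaction X after its ALTER TABLE can only be interpreted with the catalog version that X itself created.

```python
# Two hypothetical catalog snapshots for one table: before and after
# transaction X runs ALTER TABLE ... ADD COLUMN note.
catalog_v1 = ["id"]           # columns before transaction X
catalog_v2 = ["id", "note"]   # columns after X's ALTER TABLE

def decode_tuple(raw_values, columns):
    """Interpret a raw WAL tuple using a given catalog snapshot."""
    if len(raw_values) != len(columns):
        raise ValueError("catalog snapshot does not match tuple layout")
    return dict(zip(columns, raw_values))

# X's INSERT produces a two-attribute tuple in WAL; decoding it with the
# old catalog fails, while the post-ALTER catalog interprets it correctly.
wal_insert = (42, "hello")
print(decode_tuple(wal_insert, catalog_v2))  # {'id': 42, 'note': 'hello'}
```

The point of the sketch is only that decoding needs the catalog as it stood when the record was written, which is why a "synchronized catalog at the decoding site" (or a proxy instance holding one) is part of the design.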
Re: [HACKERS] libpq compression
On Thu, Jun 14, 2012 at 9:57 AM, Phil Sorber wrote: > It doesn't sound like there is a lot of support for this idea, but I > think it would be nice to get something like lz4 > (http://code.google.com/p/lz4/) or snappy > (http://code.google.com/p/snappy/) support. Both are BSD-ish licensed. > It could be useful for streaming replication as well. A hook (as Euler > mentioned) might be a nice compromise. There is a lot of support for the idea: it's one of the more requested features. I think a well-thought-out framework that bypassed the dependency issues via plugging might get some serious traction. Emphasis on 'well thought out' :-). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minimising windows installer password confusion
On Thu, Jun 14, 2012 at 11:43 AM, Dave Page wrote: > > I'll have a play with it and see if a simple switch to NetworkService > seems feasible. OK, I worked up a patch which uses "NT AUTHORITY\NetworkService" as the service account by default. This doesn't need a password, so allows us to simply prompt during installation for the superuser password for the cluster, and not at all during upgrade. If you run the installer from the command line with "--serviceaccount postgres" (or some other account name), you get the current behaviour. I've posted it on our internal ReviewBoard system for the rest of the team to review and test on various platforms (I've only tried it on XP so far). Now would be a very good time for people to yelp if they think this is a bad idea (the only downside I can see if accessing files for server-side copy etc, but users that want to do that can install with a non-default account). If we go ahead, we'll include it in 9.2. Thanks for the feedback folks. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Allow WAL information to recover corrupted pg_controldata
I am planning to work on the below Todo list item for this CommitFest: Allow WAL information to recover corrupted pg_controldata http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php I wanted to confirm my understanding of the work involved for this patch. The existing patch has the following problems: 1. Memory leak, and the linked list code path is not correct 2. The lock check for whether the server is already running is removed in the patch, which needs to be reverted 3. The code needs refactoring. Apart from the above, what I understood from the patch is that its intention is to generate values for ControlFile using the WAL logs when the -r option is used. The change from the current algorithm will be: if the control file is corrupt, which essentially means ReadControlFile() returns false, then generate the values (checkPointCopy, checkPoint, prevCheckPoint, state) from the WAL if the -r option is enabled. Also, with the -r option, there is no need to call FindEndOfXLOG(), as that work is achieved by the above point. It will just rewrite the control file and not perform the other resets. The algorithm for restoring the pg_control values from old xlog files: 1. Retrieve all of the active xlog files from the xlog directory into a list in increasing order, according to their timeline, log id and segment id. 2. Search the list to find the oldest xlog file of the latest timeline. 3. Search the records from the oldest xlog file of the latest timeline to the latest xlog file of the latest timeline; if a checkpoint record is found, update the latest and previous checkpoint. Apart from the above, some changes in code will be required after the XLog patch by Heikki. Please let me know whether my understanding is correct.
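Steps 1 and 2 of the restore algorithm above can be sketched in Python. The segment names and helper below are illustrative only; they assume the standard 24-hex-digit WAL segment naming, which encodes timeline, log id and segment id.

```python
def parse_segment(name):
    """Split a 24-hex-digit WAL segment name into (timeline, log, seg)."""
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)

# Hypothetical contents of the xlog directory
segments = [
    "000000010000000000000003",
    "000000020000000000000005",
    "000000020000000000000004",
]

# Step 1: order by (timeline, log id, segment id)
ordered = sorted(segments, key=parse_segment)

# Step 2: oldest segment belonging to the latest timeline
latest_tli = max(parse_segment(s)[0] for s in ordered)
on_latest = [s for s in ordered if parse_segment(s)[0] == latest_tli]
oldest_of_latest = on_latest[0]

# Step 3 (not shown) would scan records forward from oldest_of_latest,
# remembering the last checkpoint record seen.
print(oldest_of_latest)
```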
Re: [HACKERS] sortsupport for text
On 18 March 2012 15:08, Tom Lane wrote: > One other thing I've always wondered about in this connection is the > general performance of sorting toasted datums. Is it better to detoast > them in every comparison, or pre-detoast to save comparison cycles at > the cost of having to push much more data around? I didn't see any > discussion of this point in Robert's benchmarks, but I don't think we > should go very far towards enabling sortsupport for text until we > understand the issue and know whether we need to add more infrastructure > for it. If you cross your eyes a little bit, this is very much like > the strxfrm question... I see the parallels. I note that glibc's strcoll_l() is implemented entirely in C (strcoll() itself is implemented in terms of strcoll_l() ), whereas the various strcmp.S are written in hand-optimized assembler, with SSE3 instructions in the "Highly optimized version for x86-64", for example. I wonder just how important a factor that is. I suppose the reason why the glibc guys haven't just done something equivalent internally might be that they much prefer to perform the comparison in-place, due to the need to target a conservative lowest common denominator...or it could be because it just doesn't matter that much. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
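The strxfrm trade-off Tom raises can be illustrated with Python's locale module, which exposes both C-library primitives: strcoll pays the collation cost on every comparison (analogous to detoasting in each comparator call), while strxfrm transforms each key once so that later comparisons are cheap byte comparisons, at the cost of pushing around larger transformed keys. This is only an analogy for the sorting question, not PostgreSQL code.

```python
import locale
from functools import cmp_to_key

words = ["pear", "apple", "Banana", "cherry"]

# Compare-in-place: the locale-aware comparison runs for every pair.
by_coll = sorted(words, key=cmp_to_key(locale.strcoll))

# Pre-transform: one strxfrm per element up front, then plain
# byte-wise comparisons on the transformed keys.
by_xfrm = sorted(words, key=locale.strxfrm)

# Both strategies must yield the same order; they differ only in
# where the collation work is spent.
assert by_coll == by_xfrm
print(by_xfrm)
```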
Re: [HACKERS] [PATCH 02/16] Add zeroRecPtr as a shortcut for initializing a local variable to {0, 0}
On Thursday, June 14, 2012 04:04:22 PM Robert Haas wrote: > On Thu, Jun 14, 2012 at 9:57 AM, Andres Freund wrote: > > On Thursday, June 14, 2012 03:50:28 PM Robert Haas wrote: > >> On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund > > > > wrote: > >> > This is locally defined in lots of places and would get introduced > >> > frequently in the next commits. It is expected that this can be > >> > defined in a header-only manner as soon as the XLogInsert scalability > >> > groundwork from Heikki gets in. > >> > >> This appears to be redundant with the existing InvalidXLogRecPtr, > >> defined in access/transam.h. > > > > Oh. I didn't find that one. Judging from all the code defining local > > variants of it I am not alone in that... Why is it in transam.h and not > > xlogdefs.h? > > Uh, not sure. We used to have a variable by that name defined in a > bunch of places, and I cleaned it up some in commit > 503c7305a1e379f95649eef1a694d0c1dbdc674a. But if there are still more > redundant definitions floating around, it would be nice to clean those > up. Forget it, they are in things that don't link to the backend... /me looks forward to the 64bit conversion of XLogRecPtr's. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
On Thu, Jun 14, 2012 at 10:45 AM, Tom Lane wrote: > Robert Haas writes: >> Is RESOURCE_RELEASE_AFTER_LOCKS actually used for anything? Is it >> just for extensions? > > I'm too lazy to go look, but it certainly ought to be in use. > The idea is that that's the phase for post-lock-release cleanup, > and anything that can possibly be postponed till after releasing > locks certainly should be ... Oh, you're right. I missed the logic in ResourceOwnerReleaseInternal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] libpq compression
On Thu, Jun 14, 2012 at 10:14 AM, Florian Pflug wrote: > On Jun14, 2012, at 15:28 , Euler Taveira wrote: >> On 14-06-2012 02:19, Tom Lane wrote: >>> I still think that pushing this off to openssl (not an ssh tunnel, but >>> the underlying transport library) would be an adequate solution. >>> If you are shoving data over a connection that is long enough to need >>> compression, the odds that every bit of it is trustworthy seem pretty >>> small, so you need encryption too. >>> >> I don't want to pay the SSL connection overhead. Also I just want >> compression, >> encryption is not required. OpenSSL gives us encryption with/without >> compression; we need an option to obtain compression in non-SSL connections. > > AFAIR, openssl supports a NULL cipher which doesn't do any encryption. We > could have a connection parameter, say compress=on, which selects that > cipher (unless sslmode is set to prefer or higher, of course). > > SSL NULL-cipher connections would be treated like unencrypted connections > when matching against pg_hba.conf. > > best regards, > Florian Pflug It doesn't sound like there is a lot of support for this idea, but I think it would be nice to get something like lz4 (http://code.google.com/p/lz4/) or snappy (http://code.google.com/p/snappy/) support. Both are BSD-ish licensed. It could be useful for streaming replication as well. A hook (as Euler mentioned) might be a nice compromise.
Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
Robert Haas writes: > Is RESOURCE_RELEASE_AFTER_LOCKS actually used for anything? Is it > just for extensions? I'm too lazy to go look, but it certainly ought to be in use. The idea is that that's the phase for post-lock-release cleanup, and anything that can possibly be postponed till after releasing locks certainly should be ... regards, tom lane
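Tom's description of the release phases can be modeled as a toy Python sketch. The phase names mirror the C constants; everything else is invented. Cleanup registered for the after-locks phase runs only once the lock-release phase has completed, which is the whole point of having a separate phase.

```python
# Phased resource release: callbacks fire strictly in phase order,
# so anything postponable until after lock release goes in AFTER_LOCKS.
BEFORE_LOCKS, LOCKS, AFTER_LOCKS = range(3)

log = []
callbacks = {BEFORE_LOCKS: [], LOCKS: [], AFTER_LOCKS: []}

def register(phase, fn):
    callbacks[phase].append(fn)

def release_all():
    for phase in (BEFORE_LOCKS, LOCKS, AFTER_LOCKS):
        for fn in callbacks[phase]:
            fn()

# e.g. unlinking dropped files is safe to defer past lock release
register(AFTER_LOCKS, lambda: log.append("unlink files"))
register(LOCKS, lambda: log.append("release locks"))

release_all()
print(log)  # ['release locks', 'unlink files']
```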
Re: [HACKERS] Backup docs
On Wed, Jun 13, 2012 at 3:20 PM, Dimitri Fontaine wrote: > Please let's apply that documentation patch to 9.2 too. Agreed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
On Sun, Jun 10, 2012 at 5:37 PM, Tom Lane wrote: > Robert Haas writes: >> On Sun, Jun 10, 2012 at 4:19 PM, Noah Misch wrote: >>> Agreed. We now have $OLD_SUBJECT, but this is a win independently. I have >>> reviewed the code that runs between the old and new call sites, and I did >>> not >>> identify a hazard of moving it as you describe. > >> I looked at this when we last discussed it and didn't see a problem >> either, so I tend to agree that we ought to go ahead and do this, > > +1, as long as you mean in 9.3 not 9.2. I don't see any risk either, > but the time for taking new risks in 9.2 is past. > > Noah, please add this patch to the upcoming CF, if you didn't already. I re-reviewed this and committed it. Is RESOURCE_RELEASE_AFTER_LOCKS actually used for anything? Is it just for extensions? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink
> That seems undesirable. I don't think this is important enough to be worth reparsing > the connection string for. The connect string should not be long, so parsing is not a big cost performance wise. I have currently modified the code for dblink in the patch I have uploaded in CF. However as per your suggestion, I will remove it during handling of other Review comments for patch unless somebody asks to keep it. -Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Thursday, June 14, 2012 7:25 PM To: Amit Kapila Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink On Wed, Jun 13, 2012 at 12:07 AM, Amit Kapila wrote: > >> Why not 'dblink'? > > We can do for dblink as well. I just wanted to check before implementing in > dblink. > > I have checked the dblink_connect() function, it receives the connect string > and used in most cases that string to > call libpq connect which is different from pgbench and oid2name where > connection parameters are formed in main function and then call libpq > connect. > > To achieve the same in dblink, we need to parse the passed connection string > and check if it contains fallback_application_name, if yes then its okay, > otherwise we need to append fallback_application_name in connection string. That seems undesirable. I don't think this is important enough to be worth reparsing the connection string for. I'd just forget about doing it for dblink if there's no cheaper way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq compression
On Jun14, 2012, at 15:28 , Euler Taveira wrote: > On 14-06-2012 02:19, Tom Lane wrote: >> I still think that pushing this off to openssl (not an ssh tunnel, but >> the underlying transport library) would be an adequate solution. >> If you are shoving data over a connection that is long enough to need >> compression, the odds that every bit of it is trustworthy seem pretty >> small, so you need encryption too. >> > I don't want to pay the SSL connection overhead. Also I just want compression, > encryption is not required. OpenSSL gives us encryption with/without > compression; we need an option to obtain compression in non-SSL connections. AFAIR, openssl supports a NULL cipher which doesn't do any encryption. We could have a connection parameter, say compress=on, which selects that cipher (unless sslmode is set to prefer or higher, of course). SSL NULL-cipher connections would be treated like unencrypted connections when matching against pg_hba.conf. best regards, Florian Pflug
Re: [HACKERS] [PATCH 02/16] Add zeroRecPtr as a shortcut for initializing a local variable to {0, 0}
On Thu, Jun 14, 2012 at 9:57 AM, Andres Freund wrote: > On Thursday, June 14, 2012 03:50:28 PM Robert Haas wrote: >> On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund > wrote: >> > This is locally defined in lots of places and would get introduced >> > frequently in the next commits. It is expected that this can be defined >> > in a header-only manner as soon as the XLogInsert scalability groundwork >> > from Heikki gets in. >> >> This appears to be redundant with the existing InvalidXLogRecPtr, >> defined in access/transam.h. > Oh. I didn't find that one. Judging from all the code defining local variants > of it I am not alone in that... Why is it in transam.h and not xlogdefs.h? Uh, not sure. We used to have a variable by that name defined in a bunch of places, and I cleaned it up some in commit 503c7305a1e379f95649eef1a694d0c1dbdc674a. But if there are still more redundant definitions floating around, it would be nice to clean those up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] creating objects in pg_catalog
On Thu, Jun 7, 2012 at 8:44 AM, Robert Haas wrote: > On Wed, Jun 6, 2012 at 5:21 PM, Tom Lane wrote: >> Robert Haas writes: >>> rhaas=# create table pg_catalog.tom (a int); >>> ERROR: permission denied to create "pg_catalog.tom" >> >>> The offending error check is in heap_create(), and based on what >>> you're saying here it seems like we should just rip it out. >> >> Hmm. Yeah, it seems like the regular permissions tests on the schemas >> in question should be enough to keep Joe User from making tables there, >> and I do not see a reason why the backend would care if there are >> non-catalog tables laying about in pg_catalog. >> >> Checking the commit history, it seems this was originally a test to >> prevent people from creating tables named "pg_xxx": >> >> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f5a10e722c052006886b678995695001958a#patch3 >> >> which may or may not have been critical once upon a time, but surely is >> not any more. >> >> So no objection to removing that particular test. > > Patch attached. Barring objections, I'll apply this to 9.3 once we branch. Hearing no objections, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 02/16] Add zeroRecPtr as a shortcut for initializing a local variable to {0, 0}
On Thursday, June 14, 2012 03:50:28 PM Robert Haas wrote: > On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund wrote: > > This is locally defined in lots of places and would get introduced > > frequently in the next commits. It is expected that this can be defined > > in a header-only manner as soon as the XLogInsert scalability groundwork > > from Heikki gets in. > > This appears to be redundant with the existing InvalidXLogRecPtr, > defined in access/transam.h. Oh. I didn't find that one. Judging from all the code defining local variants of it I am not alone in that... Why is it in transam.h and not xlogdefs.h? Obviously that patch is void then. Doesn't warrant rebasing the other patches yet though... Thanks! Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink
On Wed, Jun 13, 2012 at 12:07 AM, Amit Kapila wrote: > >> Why not 'dblink'? > > We can do for dblink as well. I just wanted to check before implementing in > dblink. > > I have checked the dblink_connect() function, it receives the connect string > and used in most cases that string to > call libpq connect which is different from pgbench and oid2name where > connection parameters are formed in main function and then call libpq > connect. > > To achieve the same in dblink, we need to parse the passed connection string > and check if it contains fallback_application_name, if yes then its okay, > otherwise we need to append fallback_application_name in connection string. That seems undesirable. I don't think this is important enough to be worth reparsing the connection string for. I'd just forget about doing it for dblink if there's no cheaper way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
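The approach Amit describes, and Robert considers too expensive to bother with for dblink, amounts to something like the following naive sketch. The helper name is invented, and real libpq conninfo strings also allow quoting and escaping, which this deliberately ignores.

```python
def with_fallback_app_name(conninfo, app_name):
    """Append fallback_application_name to a conninfo string if absent.

    Naive keyword scan over space-separated key=value pairs; a real
    implementation would reuse libpq's conninfo parser.
    """
    keys = [part.split("=", 1)[0] for part in conninfo.split() if "=" in part]
    if "fallback_application_name" in keys:
        return conninfo
    return conninfo + " fallback_application_name=" + app_name

print(with_fallback_app_name("host=localhost dbname=test", "dblink"))
# host=localhost dbname=test fallback_application_name=dblink
```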
Re: [HACKERS] Restrict ALTER FUNCTION CALLED ON NULL INPUT (was Re: Not quite a security hole: CREATE LANGUAGE for non-superusers)
On Tue, Jun 12, 2012 at 03:13:26PM -0400, Tom Lane wrote: > Even if I accepted that premise, this patch is a pretty bad > implementation of it, because it restricts cases that there is no > reason to think are unsafe. > > A less bizarre and considerably more future-proof restriction, IMO, > would simply refuse any attempt to give ownership of a C function > to a non-superuser. That just moves the collateral damage to a different set of victims. Hardcoding the list of vulnerable languages isn't so "future-proof". I'd otherwise agree in principle if we were designing a system from scratch, but it doesn't fit with the need to harmoniously protect existing systems. Adding this restriction now would make some existing databases fail to restore from dumps. That is grave enough at any juncture, let alone for a backpatched fix. On Tue, Jun 12, 2012 at 04:14:48PM -0400, Tom Lane wrote: > Basically, if we go down the road Noah is proposing, I foresee a steady > stream of security bugs and ensuing random restrictions on what function > owners can do. I do not like that future. Then let's have every ALTER FUNCTION require the same language usage check as CREATE FUNCTION. Or, if you insist, do so only for the hardcoded cases of INTERNALlanguageId and ClanguageId, and document that no third-party PL may rely on STRICT to the extent they do. This of course forbids more than necessary but still strictly less than your proposal of blocking the original ownership change. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH 02/16] Add zeroRecPtr as a shortcut for initializing a local variable to {0, 0}
On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund wrote: > This is locally defined in lots of places and would get introduced frequently > in the next commits. It is expected that this can be defined in a header-only > manner as soon as the XLogInsert scalability groundwork from Heikki gets in. This appears to be redundant with the existing InvalidXLogRecPtr, defined in access/transam.h. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] \conninfo and SSL
On Sun, Jun 3, 2012 at 5:30 AM, Alastair Turner wrote: > A one-line change adds the SSL info on its own line like > > -- > You are connected to database "scratch" as user "scratch" on host > "127.0.0.1" at port "5432". > SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256) > -- > > Does this need a more integrated presentation, and therefore a broader > change to make it translatable? Committed to master. I didn't make it conditional on a non-local connection, though, since there seemed to be no reason to do it that way. P.S. Email mangles patches. It's better to attach them rather than including them inline. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] libpq compression
On 14-06-2012 02:19, Tom Lane wrote: > I still think that pushing this off to openssl (not an ssh tunnel, but > the underlying transport library) would be an adequate solution. > If you are shoving data over a connection that is long enough to need > compression, the odds that every bit of it is trustworthy seem pretty > small, so you need encryption too. > I don't want to pay the SSL connection overhead. Also I just want compression; encryption is not required. OpenSSL gives us encryption with/without compression; we need an option to obtain compression in non-SSL connections. > We do need the ability to tell openssl to use compression. We don't > need to implement it ourselves, nor to bring a bunch of new library > dependencies into our builds. I especially think that importing bzip2 > is a pretty bad idea --- it's not only a new dependency, but bzip2's > compression versus speed tradeoff is entirely inappropriate for this > use-case. > I see, the idea is that bzip2 would be a compiled-in option (not enabled by default) just to give another compression option. I don't have a strong opinion about including it as another dependency. We already depend on zlib, and implementing compression using it won't add another dependency. What do you think about adding a hook at libpq to load an extension that does the compression? That way we don't add another dependency at libpq, and a lot of extensions could be coded to cover a variety of algorithms without putting us in trouble because of patent infringement. -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
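The compression-versus-speed trade-off under discussion is easy to probe with Python's standard bindings for the same two libraries. The payload below is invented and the numbers will vary by machine; the sketch only illustrates how one would compare ratio and time for zlib against bzip2 on protocol-like, highly repetitive traffic.

```python
import bz2
import time
import zlib

# Repetitive, wire-traffic-like payload (hypothetical)
payload = b"SELECT * FROM pg_class;\n" * 2000

t0 = time.perf_counter()
z = zlib.compress(payload, 6)          # zlib's default-ish level
tz = time.perf_counter() - t0

t0 = time.perf_counter()
b = bz2.compress(payload, 9)           # bzip2 at maximum compression
tb = time.perf_counter() - t0

print(f"original: {len(payload)} bytes")
print(f"zlib:  {len(z)} bytes in {tz:.4f}s")
print(f"bzip2: {len(b)} bytes in {tb:.4f}s")
```

Both round-trip losslessly; the interesting question for a wire protocol is how much extra CPU bzip2 spends for its ratio advantage, which this kind of measurement makes concrete.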
Re: [HACKERS] Ability to listen on two unix sockets
On Thu, Jun 14, 2012 at 1:04 AM, Tom Lane wrote: > Robert Haas writes: >> Maybe: > >> listen_addresses = { host | host:port | * | *:port } [, ...] >> unix_socket_directory = { directory | directory:port ] [,...] > >> ...except that colon is a valid character in a directory name. Not >> sure what to do about that. > > Do we need to do anything? There are no standard scenarios in which a > colon would appear in such paths, and I don't see why we need to cater > for it. (Remember that Windows doesn't enter into this ...) True. And, we should be able to design the parsing algorithm so that they can fix it by adding an otherwise-redundant port specification - i.e. if you want to put the socket in a directory called /me:1, then write /me:1:5432 -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
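The parsing rule Robert sketches, where a trailing all-digits component is treated as the port and anything else leaves the whole string as the directory, could look like the following (the helper name and default are assumptions for illustration):

```python
def split_socket_dir(spec, default_port=5432):
    """Split 'directory[:port]' where colons are legal in the directory.

    If the text after the last colon is all digits, treat it as a port;
    otherwise the entire spec is the directory. A directory that itself
    ends in ':digits' must carry an explicit port, e.g. '/me:1:5432'.
    """
    head, sep, tail = spec.rpartition(":")
    if sep and tail.isdigit():
        return head, int(tail)
    return spec, default_port

print(split_socket_dir("/me:1:5432"))  # ('/me:1', 5432)
```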
Re: [HACKERS] [PATCH] Support for foreign keys with arrays
On Wed, 13/06/2012 at 18.07 -0400, Noah Misch wrote: > On Wed, Jun 13, 2012 at 10:12:18PM +0200, Gabriele Bartolini wrote: > > Our goal is to work on this patch from the next commit fest. > > > > What we are about to do for this commit fest is to split the previous > > patch and send a small one just for the array_remove() and > > array_replace() functions. > > > > Then we will sit down and organise the development of the feature > > according to Peter's feedback. It is important indeed that we find a > > commonly accepted terminology and syntax for this feature. > > > > I hope this sounds like a reasonable plan. Thank you. > > Sounds fine. The 2012-01 CF entry for this patch has been moved to the > 2012-06 CF. Please mark that entry Returned with Feedback and create a new > entry for the subset you're repackaging for this CommitFest. > > Thanks, > nm > Dear Noah, I have added a new patch to the commitfest (miscellaneous category) entitled "Support for array_remove and array_replace functions". I have also marked the "Foreign Key Array" patch as "Returned with Feedback" as per your request. We will start working again on this patch in the next commitfest as anticipated by Gabriele, hoping that the array functions will be committed by then. Thank you. Cheers, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] Minimising windows installer password confusion
On Thu, Jun 14, 2012 at 12:55 AM, Craig Ringer wrote: > On 06/13/2012 05:14 PM, Dave Page wrote: >> >> On Wed, Jun 13, 2012 at 2:18 AM, Craig Ringer >> wrote: >>> >>> On 06/12/2012 08:08 PM, Dave Page wrote: Some background: By default the installer will use 'postgres' for both the service (OS) account, and the database superuser account. It will use the same password for both (though, users have complete control at the command line if they want it, which is why the account names are shown in the installer). Unlike *nix installations, the service account must have a password, which is required during both installation and upgrade to configure the PostgreSQL service. We therefore ask for the password during both installation and upgrade. If the user account exists (which can happen if there has previously been an installation of Postgres on the system), the user must specify the existing password for the account during installation (and of course, during all upgrades). This is where users normally get stuck, as they've forgotten the password they set in the past. >>> >>> They may not even have set it, because the EnterpriseDB installer can be >>> run >>> unattended and may've been used by some 3rd party package. >>> >>> I'd be interested in seeing what Microsoft installers that create >>> isolated >>> user accounts do. I think .NET creates one for ASP, so that'd be one >>> option >>> to look into. >> >> They tend to use the localsystem or networkservice accounts for most >> things, which don't have passwords. The reason we don't do that is >> that since the early days of 8.0 we've said we want to enable users to >> use the service account as if it were a regular account, so that they >> can do things like access network resources (useful for server-side >> copy for example). >> >> I wasn't overly convinced back then that that was necessary, and I'm >> still not now. 
I suppose we potentially could use the networkservice >> account by default, and let advanced users override that on the >> command line if they need to. Then remembering the password becomes >> their responsibility. > > +1 from me on this, though I think making the service account part of the > install process makes sense. > > SERVICE ACCOUNT > - > > You can probably just accept the default here. > > PostgreSQL runs in the background as a network service in a user account > that > only has limited access to the files and programs on the computer. This is > fine > for most purposes, but will prevent certain operations like server-side COPY > and > direct access by the server to resources on shared folders from working. If > you > need these capabilities, we recommend that you create a "postgres" user > account > below and have the PostgreSQL server run in that instead of the default > NetworkService account. > > -- [service account] --- > | | > | [*] LocalService | > | | > | [ ] Custom Service Account | > | | > | -- [custom account]--| | > | | | | > | | Username: [ ] | | > | | Password: [ ] | | > | | Domain: [ THISCOMPUTER ] | | > | | | | > | || | > |--| > > Reasonable option? Too complicated - it'll confuse users too (and it goes against the primary goal of the installers which is to setup the server in a suitable way for production use, with the absolute minimum of user interaction). We try to provide more complex config options that 99% of users won't need through the command line only (though, in the future I guess it might make sense to have a "Show advanced options" checkbox on the first page, though that's definitely out of scope for 9.2). I'll have a play with it and see if a simple switch to NetworkService seems feasible. 
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Support for array_remove and array_replace functions
Hi,

Following Gabriele's email regarding our previous patch on "Foreign Key Arrays" [1], I am sending a subset of that patch which includes only the two array functions that will be needed by it: array_remove (limited to single-dimensional arrays) and array_replace. The patch includes changes to the documentation.

Cheers,
Marco

[1] http://archives.postgresql.org/message-id/4FD8F422.40709%402ndQuadrant.it

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

[Attachment: array-functions.patch.bz2 (application/bzip)]
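As a sketch of the semantics described for the two proposed functions, here is a hypothetical Python model. The Python helpers below are illustrative only (the patch itself implements these as C functions in the backend), and they cover only the one-dimensional case the patch targets:

```python
# Illustrative model of the two proposed SQL functions; these helpers
# only mirror the described behaviour for one-dimensional arrays and
# are not part of the patch.

def array_remove(arr, elem):
    """Return arr with every occurrence of elem removed."""
    return [x for x in arr if x != elem]

def array_replace(arr, old, new):
    """Return arr with every occurrence of old replaced by new."""
    return [new if x == old else x for x in arr]

print(array_remove([1, 2, 3, 2], 2))      # [1, 3]
print(array_replace([1, 2, 3, 2], 2, 9))  # [1, 9, 3, 9]
```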
Re: [HACKERS] Patch pg_is_in_backup()
Hello Gilles,

thank you very much for your patience (and thank you, Marco, for supporting me). I apologise for the delay. I have retested the updated patch and it works fine for me. It is "ready for committer" as far as I'm concerned.

Cheers,
Gabriele

On 14/06/12 11:36, Marco Nenciarini wrote:
> Hi Gilles,
> unfortunately Gabriele has been very busy recently and he asked me to
> check again the status of this patch for this commit fest. In order to
> speed up the application of the patch, I am sending an updated version
> which compiles correctly against the current head. Gabriele will then
> proceed with the review.
> Thank you,
> Marco

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] Patch pg_is_in_backup()
Hi Gilles,

unfortunately Gabriele has been very busy recently and he asked me to check again the status of this patch for this commit fest. In order to speed up the application of the patch, I am sending an updated version which compiles correctly against the current head. Gabriele will then proceed with the review.

Thank you,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

[Attachment: pg_backup_start_time-and-pg_is_in_backup-v4.1.patch.bz2 (application/bzip)]
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
Tom Lane writes:
> Well, TBH I didn't think that concept was a useful solution for this at
> all. I can't imagine that we would define "features" at a sufficiently
> fine granularity, or with enough advance foresight, to solve the sort of
> problem that's being faced here. How would you deal with the need for,
> say, some of contrib/xml2's functions to get migrated to core in 9.4 or
> so? When you didn't know that would happen, much less exactly which
> functions, when 9.3 came out?

You can provide and require new feature names on the same dependency graph, and the patch allows us to do so without foresight. The trade-off is that we would need to edit the previous version of PostgreSQL to add the new feature names. In your example, we would add a feature name to the contrib/xml2 extension in a 9.3 minor release, and add the same feature to the 9.4 core distribution. That requires a minor upgrade before the major upgrade for us to have a chance to deal with the lack of foresight.

> AFAICS the only way that "features" could fix this would be if we always
> created a feature for every exposed function, which is unmanageable.

Agreed; exposing each function as a feature is not the way to go.

> AFAICS, the SQL-standard features list is just about entirely irrelevant
> to this discussion. How often is it going to happen that we implement a
> standard feature in a contrib module before migrating it into core?

I wanted the standard to help me with the "core features" idea; I agree it's not a smart move here.

> I think almost every case in which we'll face this problem will involve
> a PG-specific feature not mentioned in the SQL feature taxonomy. The
> case at hand (some proposed new functions for managing replication)
> certainly isn't going to be found there.

Agreed.

> And, quite aside from whether we could invent feature names that match
> what we want to move from contrib to core, exactly how would having a
> feature name help? The problem we're actually facing is getting
> pg_upgrade to not dump particular functions when it's doing a
> binary-upgrade dump of an extension. Maybe I've forgotten, but I do not
> recall any exposed connection between feature names and particular SQL
> objects in your proposal.

We're talking about several distinct problems here: one is the low-level upgrade mechanism that preserves OIDs, and the other is upgrading to a newer version of the same extension whose content has changed. And we want to do both, of course.

The idea would be to implement the upgrade as you're proposing, or at least as I understand it, which is to just issue a CREATE EXTENSION command as part of the schema creation. That will run the new extension script, which will know not to install deprecated functions. The problem with that is preserving OIDs that might be stored in user relations, such as types.

If we want pg_upgrade to cope with upgrading an extension to new content, we have to change its implementation. One idea would be to add hooks in the backend code at OID assignment time so that pg_upgrade can attach code that provides the OID. As input it should have creating_extension, the extension name, the schema name and the object name, and also the object's "argument list" for more complex things such as functions and aggregates.

I don't think the "extension features" dependency-management patch should, by itself, be fixing the problem of running the newer extension script while keeping the existing OIDs.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
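The retroactive-feature-name idea can be reduced to a small sketch. Everything below is invented for illustration (the feature name, the sets, the checker function); the point is only that a dependency check needs the *name*, not foresight about which release or module provides it:

```python
# Hypothetical sketch of "feature names on a dependency graph": a 9.3.x
# minor release can start providing a feature name from contrib, and 9.4
# can provide the same name from core after the functions migrate.
# The name "xml_legacy_functions" is made up for this example.

def missing_features(required, provided):
    """Return the feature names an installation fails to provide."""
    return set(required) - set(provided)

# After the 9.3.x minor release, contrib/xml2 provides the name...
pg93_provides = {"core", "xml_legacy_functions"}
# ...and in 9.4 the same name is provided by core instead.
pg94_provides = {"core", "xml_legacy_functions"}

# A dump requiring the feature resolves on either release; a 9.3.x
# installation *without* the minor update would fail the check.
print(missing_features({"xml_legacy_functions"}, pg93_provides))  # set()
print(missing_features({"xml_legacy_functions"}, pg94_provides))  # set()
print(missing_features({"xml_legacy_functions"}, {"core"}))
```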
Re: [HACKERS] libpq compression
Euler Taveira wrote:
> There has already been some discussion about compressing libpq data
> [1][2][3]. Recently, I faced a scenario that would become less
> problematic if we had compression support. The scenario is frequent data
> load (aka COPY) over slow/unstable links. It has to be executed on a few
> hundred PostgreSQL servers all over Brazil. Someone could argue that I
> could use an ssh tunnel to solve the problem, but (i) it is complex
> because it involves a different port in the firewall, and (ii) it's an
> opportunity to improve other scenarios, like reducing bandwidth
> consumption during replication or normal operation over slow/unstable
> links.

Maybe I'm missing something obvious, but shouldn't a regular SSL connection (sslmode=require) do what you are asking for? At least from OpenSSL 0.9.8 on, data is compressed by default. You don't need an extra port in the firewall for that.

Yours,
Laurenz Albe
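For the COPY-over-slow-links scenario, the suggestion amounts to a connection-string setting on the client side. A sketch (host, database, and user names below are placeholders, and it assumes the client and server OpenSSL builds include zlib with compression not disabled):

```shell
# Placeholder connection parameters; sslmode=require forces an SSL
# session, and an OpenSSL (>= 0.9.8) built with zlib will negotiate
# TLS-level compression for it by default -- no extra firewall port.
psql "host=db.example.com dbname=load_target user=loader sslmode=require"

# The data load then travels over the compressed channel, e.g.:
#   \copy big_table FROM 'big_table.csv' WITH (FORMAT csv)
```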
Re: [HACKERS] hint bit i/o reduction
> yeah -- but you still have to call:
> *) TransactionIdIsCurrentTransactionId(), which adds a couple of tests
>    (and a bsearch in the presence of subtransactions)
> *) TransactionIdIsInProgress(), which adds a few tests and an
>    out-of-line call in the typical case
> *) TransactionIdDidCommit(), which adds a few tests and two
>    out-of-line calls in the typical case
>
> and, while setting the hint bit:
> *) Several more tests, plus:
> *) TransactionIdGetCommitLSN() -- another out-of-line call and a test
> *) XLogNeedsFlush() -- two more out-of-line calls plus a few tests
> *) SetBufferCommitInfoNeedsSave() -- an out-of-line call, several more
>    tests
>
> adding a few instructions to the above probably won't be measurable

You are right: adding to the above path should not impact performance, and moreover all the checks and assignments you are proposing are backend-local, so there is no contention either.

Just one observation: in some cases it will do only point 1, or only points 1 and 2. For example:
1. xmin is committed and its hint bit is already set.
2. xmax's hint bit is not set, and it is still not visible to the current transaction.

In this case it will do only points 1 and 2.

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: Wednesday, June 13, 2012 8:36 PM
To: Amit Kapila
Cc: PostgreSQL-development
Subject: Re: [HACKERS] hint bit i/o reduction

On Wed, Jun 13, 2012 at 9:02 AM, Amit Kapila wrote:
>> Yes, but only in the unhinted case -- in oltp workloads tuples get
>> hinted fairly quickly so I doubt this would be a huge impact. Hinted
>> scans will work exactly as they do now. In the unhinted case for OLTP
>> a few tests are added but that's noise compared to the other stuff
>> going on.
>
> I believe the design you have proposed is for the unhinted cases only;
> that is, if the tuple already has the hint set then it will not go
> through the new logic. Is that right?
>
> In unhinted cases, the hint bit can be set in two ways:
> 1. It gets the transaction status from the clog memory buffers.
> 2. It needs to do I/O to get the transaction status from the clog.
>
> So, I think it will add overhead in case 1, where the processing is
> fast but now we will add some new instructions to it.

yeah -- but you still have to call:

*) TransactionIdIsCurrentTransactionId(), which adds a couple of tests
   (and a bsearch in the presence of subtransactions)
*) TransactionIdIsInProgress(), which adds a few tests and an
   out-of-line call in the typical case
*) TransactionIdDidCommit(), which adds a few tests and two out-of-line
   calls in the typical case

and, while setting the hint bit:

*) Several more tests, plus:
*) TransactionIdGetCommitLSN() -- another out-of-line call and a test
*) XLogNeedsFlush() -- two more out-of-line calls plus a few tests
*) SetBufferCommitInfoNeedsSave() -- an out-of-line call, several more
   tests

adding a few instructions to the above probably won't be measurable (naturally it would be tested to confirm that).

merlin
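The hinted-vs-unhinted fast path being discussed can be caricatured as follows. This is a hypothetical Python stand-in, not the backend's C visibility code: the function names mirror the C routines Merlin lists, but the toy "proc array" and "clog" sets and all constants are invented for illustration:

```python
# Hypothetical sketch of the xmin visibility fast path; in a real
# backend the stand-in functions below consult the proc array and the
# clog (the latter possibly doing I/O), and a successful commit lookup
# is when the hint bit would get set.

HEAP_XMIN_COMMITTED = 0x1   # the hint bit (toy value)

CURRENT_XID = 100           # toy "our own transaction"
IN_PROGRESS = {101}         # toy "proc array"
COMMITTED = {90, 95}        # toy "clog"

def xid_is_current(xid):      # TransactionIdIsCurrentTransactionId()
    return xid == CURRENT_XID

def xid_is_in_progress(xid):  # TransactionIdIsInProgress()
    return xid in IN_PROGRESS

def xid_did_commit(xid):      # TransactionIdDidCommit() -- may do I/O
    return xid in COMMITTED

def xmin_visible(infomask, xmin):
    if infomask & HEAP_XMIN_COMMITTED:
        return True           # hinted case: no further tests at all
    if xid_is_current(xmin):  # point 1: backend-local tests only
        return True
    if xid_is_in_progress(xmin):   # point 2: stops here, no clog access
        return False
    return xid_did_commit(xmin)    # point 3: clog lookup; hint set here
```

The hinted branch returns before any of the out-of-line calls, which is why extra instructions added after it are expected to be noise for OLTP workloads where tuples get hinted quickly.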