Re: [HACKERS] Git diff patch in context diff format

2012-08-08 Thread Qi Huang
> Date: Wed, 8 Aug 2012 15:05:06 -0400
> From: and...@dunslane.net
> To: br...@momjian.us
> CC: huangq...@outlook.com; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Git diff patch in context diff format
> 
> 
> On 08/08/2012 01:29 PM, Bruce Momjian wrote:
> > On Thu, Aug  2, 2012 at 05:03:04PM +0800, Qi Huang wrote:
> >> Hi, hackers
> >>  I was exporting my project to a patch file. As the patch review 
> >> requires,
> >> the patch needs to be in context diff format 
> >> (http://wiki.postgresql.org/wiki/
> >> Reviewing_a_Patch). But the git diff exports in a format similar to unified
> >> format. What is everyone doing with patching now? Is there any standard 
> >> way?
> > Have you read our wiki about git and diffs?
> >
> > http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git
> 
> 
> 
> I must confess that, like Robert, I just use:
> 
> git diff | filterdiff --format=context
> 
> I tried the git config stuff mentioned on the wiki, and it bit me a few 
> times, I forget exactly how, and this just works without hassle.
> 
> cheers
> 
> andrew
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

I'm sorry, but this issue has actually already been resolved. I don't know why the 
system sent another mail... Thanks for your response.

Best Regards
Huang Qi Victor
Computer Science, National University of Singapore
  

Re: [HACKERS] Prevent restored WAL files from being archived again Re: Unnecessary WAL archiving after failover

2012-08-08 Thread Simon Riggs
On 29 July 2012 16:01, Fujii Masao  wrote:

> Attached patch changes the startup process so that it creates .done file
> whenever WAL file is successfully restored, whether archive mode is
> enabled or not. The restored WAL files will not be archived again because
> of .done file.

The proposed patch works, for archiving only, but I don't like the
code. It's a partial refactoring of existing code.

I prefer to go for a full re-factoring version for HEAD, and a zero
refactoring version for 9.2 since we're deep into beta.

I've committed the simplified version for 9.2, as well as adding
support for streaming, which you seem to have missed.

Will look at the refactored version tomorrow.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Bruce Momjian
On Wed, Aug  8, 2012 at 06:42:27PM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Excerpts from Bruce Momjian's message of Wed Aug 08 17:15:38 -0400 2012:
> >> On Wed, Aug  8, 2012 at 04:23:04PM -0400, Robert Haas wrote:
> >>> I think this is one good idea:
> >>> http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us
> 
> >> If we currently require 14 steps to use pg_upgrade, how would that
> >> reduce this number?  What failures does it fix?
> 
> > The suggestion by Tom reduces the list by two steps because it doesn't
> > need to adjust pg_hba.conf or put it back in the original way
> > afterwards.
> 
> Even more to the point, it flat-out eliminates failure modes associated
> with somebody connecting to either the old or the new cluster while
> pg_upgrade is working.  Schemes that involve temporarily hacking
> pg_hba.conf can't provide any iron-clad guarantee, because if pg_upgrade
> can connect to a postmaster, so can somebody else.

We now use a temporary port number, 50432.

> The point I think Robert was trying to make is that we need to cut down
> not only the complexity of running pg_upgrade, but the number of failure
> modes.  At least that's how I'd define improvement here.

Agreed.  Even with these changes, I still see a lot of complexity.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Bruce Momjian
On Wed, Aug  8, 2012 at 05:29:49PM -0400, Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of Wed Aug 08 17:15:38 -0400 2012:
> > On Wed, Aug  8, 2012 at 04:23:04PM -0400, Robert Haas wrote:
> > > On Tue, Aug 7, 2012 at 10:59 AM, Bruce Momjian  wrote:
> > > > Yes, the list of rough edges is the 14-steps you have to perform to run
> > > > pg_upgrade, as documented in the pg_upgrade manual page:
> > > >
> > > > http://www.postgresql.org/docs/9.2/static/pgupgrade.html
> > > >
> > > > The unknown is how to reduce the number of steps in a way the community
> > > > would find acceptable.
> > > 
> > > I think this is one good idea:
> > > 
> > > http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us
> > > 
> > > The number of steps is an issue, but the likelihood of the actual
> > > pg_upgrade run failing or doing the wrong thing is also something we
> > > need to work on.
> > 
> > If we currently require 14 steps to use pg_upgrade, how would that
> > reduce this number?  What failures does it fix?
> 
> I think that 14 is a bit of a made-up number.  Several of those steps
> are about building pg_upgrade, not actually using it.  And there are
> some that are optional anyway.
> 
> The suggestion by Tom reduces the list by two steps because it doesn't
> need to adjust pg_hba.conf or put it back in the original way
> afterwards.

True.

> Another thing worth considering is to have pg_upgrade init, stop and
> start clusters as necessary instead of requesting the user to do it.
> I think that is two fewer steps.

pg_upgrade already starts/stops the server --- it just checks to make
sure they are both stopped.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make "psql -1 < file.sql" work as with "-f"

2012-08-08 Thread David Fetter
On Wed, Aug 08, 2012 at 04:55:43PM -0400, Robert Haas wrote:
> On Wed, Aug 1, 2012 at 4:28 AM, Fabien COELHO  wrote:
> > Dear PostgreSQL developers,
> >
> > Please find attached a patch so that:
> >
> > Make "psql -1 < file.sql" work as with "-f"
> >
> > Make psql --single-transaction option work on a non-interactive
> > standard input as well, so that "psql -1 < input.sql" behaves as
> > "psql -1 -f input.sql".
> >
> > This saner/less error-prone behavior was discussed in this thread back in
> > June:
> >
> > http://archives.postgresql.org/pgsql-hackers/2012-06/msg00785.php
> >
> > I have tested it manually and it works for me. I'm not sure this is the best
> > possible implementation, but it keeps the diff small. I haven't found a place
> > in the regression tests where "psql" could be tested with different options.
> > Did I miss something?
> 
> I'm wondering if perhaps -- in addition to what you've done here -- we
> should make "psql -1" error out if reading from a terminal.

+1 for this.

> Because accepting options that are intended to cause important
> behavior changes and then ignoring those options is Bad.

Yes, It Is.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Bruce Momjian's message of Wed Aug 08 17:15:38 -0400 2012:
>> On Wed, Aug  8, 2012 at 04:23:04PM -0400, Robert Haas wrote:
>>> I think this is one good idea:
>>> http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us

>> If we currently require 14 steps to use pg_upgrade, how would that
>> reduce this number?  What failures does it fix?

> The suggestion by Tom reduces the list by two steps because it doesn't
> need to adjust pg_hba.conf or put it back in the original way
> afterwards.

Even more to the point, it flat-out eliminates failure modes associated
with somebody connecting to either the old or the new cluster while
pg_upgrade is working.  Schemes that involve temporarily hacking
pg_hba.conf can't provide any iron-clad guarantee, because if pg_upgrade
can connect to a postmaster, so can somebody else.

The point I think Robert was trying to make is that we need to cut down
not only the complexity of running pg_upgrade, but the number of failure
modes.  At least that's how I'd define improvement here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Jaime Casanova
On Wed, Aug 8, 2012 at 4:29 PM, Alvaro Herrera  wrote:
>
> I wonder if things would be facilitated by having a config file for
> pg_upgrade to specify binary and PGDATA paths instead of having awkward
> command line switches.  That way you could request the user to create
> such a file, then
>

I like this idea. When I have used pg_upgrade, I have run it several
times with --check until everything is OK and then done the actual
upgrade... so I always create a shell script to run it. A config file
would be a good idea.

>
> You could even have a mode on which pg_upgrade asks for the necessary
> information to create the config file.  For example
>
> pg_upgrade --create-config=/somewhere/pg_upgrade.conf
> Path to new binaries:  /usr/local/pg92
> Path to old binaries:  /usr/local/pg84
> Path to old data directory:  /srv/pgsql-84/data
> Path to new data directory:  /srv/pgsql-92/data
> Use link mode (y/N):  n
> ... using copy mode.
> [now run the checks and complain about missing binaries, etc]
>

Even better.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Wed Aug 08 17:15:38 -0400 2012:
> On Wed, Aug  8, 2012 at 04:23:04PM -0400, Robert Haas wrote:
> > On Tue, Aug 7, 2012 at 10:59 AM, Bruce Momjian  wrote:
> > > Yes, the list of rough edges is the 14-steps you have to perform to run
> > > pg_upgrade, as documented in the pg_upgrade manual page:
> > >
> > > http://www.postgresql.org/docs/9.2/static/pgupgrade.html
> > >
> > > The unknown is how to reduce the number of steps in a way the community
> > > would find acceptable.
> > 
> > I think this is one good idea:
> > 
> > http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us
> > 
> > The number of steps is an issue, but the likelihood of the actual
> > pg_upgrade run failing or doing the wrong thing is also something we
> > need to work on.
> 
> If we currently require 14 steps to use pg_upgrade, how would that
> reduce this number?  What failures does it fix?

I think that 14 is a bit of a made-up number.  Several of those steps
are about building pg_upgrade, not actually using it.  And there are
some that are optional anyway.

The suggestion by Tom reduces the list by two steps because it doesn't
need to adjust pg_hba.conf or put it back in the original way
afterwards.

Another thing worth considering is to have pg_upgrade init, stop and
start clusters as necessary instead of requesting the user to do it.
I think that is two fewer steps.


I wonder if things would be facilitated by having a config file for
pg_upgrade to specify binary and PGDATA paths instead of having awkward
command line switches.  That way you could request the user to create
such a file, then

# this runs the checks, gathers necessary .so files, stops old cluster,
# runs initdb for new cluster
pg_upgrade --config=/somewhere/pg_upgrade.conf --init-new 

# now plead the user to install the .so files into the new cluster

# do the actual upgrade
pg_upgrade --config=/somewhere/pg_upgrade.conf --actually-upgrade


You could even have a mode on which pg_upgrade asks for the necessary
information to create the config file.  For example

pg_upgrade --create-config=/somewhere/pg_upgrade.conf
Path to new binaries:  /usr/local/pg92
Path to old binaries:  /usr/local/pg84
Path to old data directory:  /srv/pgsql-84/data
Path to new data directory:  /srv/pgsql-92/data
Use link mode (y/N):  n
... using copy mode.
[now run the checks and complain about missing binaries, etc]

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Bruce Momjian
On Wed, Aug  8, 2012 at 04:23:04PM -0400, Robert Haas wrote:
> On Tue, Aug 7, 2012 at 10:59 AM, Bruce Momjian  wrote:
> > Yes, the list of rough edges is the 14-steps you have to perform to run
> > pg_upgrade, as documented in the pg_upgrade manual page:
> >
> > http://www.postgresql.org/docs/9.2/static/pgupgrade.html
> >
> > The unknown is how to reduce the number of steps in a way the community
> > would find acceptable.
> 
> I think this is one good idea:
> 
> http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us
> 
> The number of steps is an issue, but the likelihood of the actual
> pg_upgrade run failing or doing the wrong thing is also something we
> need to work on.

If we currently require 14 steps to use pg_upgrade, how would that
reduce this number?  What failures does it fix?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make "psql -1 < file.sql" work as with "-f"

2012-08-08 Thread Robert Haas
On Wed, Aug 1, 2012 at 4:28 AM, Fabien COELHO  wrote:
> Dear PostgreSQL developers,
>
> Please find attached a patch so that:
>
> Make "psql -1 < file.sql" work as with "-f"
>
> Make psql --single-transaction option work on a non-interactive
> standard input as well, so that "psql -1 < input.sql" behaves as
> "psql -1 -f input.sql".
>
> This saner/less error-prone behavior was discussed in this thread back in
> June:
>
> http://archives.postgresql.org/pgsql-hackers/2012-06/msg00785.php
>
> I have tested it manually and it works for me. I'm not sure this is the best
> possible implementation, but it keeps the diff small. I haven't found a place
> in the regression tests where "psql" could be tested with different options.
> Did I miss something?

I'm wondering if perhaps -- in addition to what you've done here -- we
should make "psql -1" error out if reading from a terminal.

Because accepting options that are intended to cause important
behavior changes and then ignoring those options is Bad.
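A rough sketch of what such a guard might look like in psql's startup
path (illustrative only; the function and option names here are made up,
but isatty() and fileno() are the standard calls for detecting a terminal):

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Hypothetical sketch, not the actual psql source: refuse -1 when
     * commands would be read interactively from a terminal rather than
     * from -f or redirected input. */
    static void
    reject_single_txn_on_terminal(bool single_txn, const char *script_file)
    {
        if (single_txn && script_file == NULL && isatty(fileno(stdin)))
        {
            fprintf(stderr,
                    "psql: -1/--single-transaction only makes sense with -f or redirected input\n");
            exit(EXIT_FAILURE);
        }
    }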

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -Wformat-zero-length

2012-08-08 Thread Robert Haas
On Tue, Aug 7, 2012 at 10:59 AM, Bruce Momjian  wrote:
> Yes, the list of rough edges is the 14-steps you have to perform to run
> pg_upgrade, as documented in the pg_upgrade manual page:
>
> http://www.postgresql.org/docs/9.2/static/pgupgrade.html
>
> The unknown is how to reduce the number of steps in a way the community
> would find acceptable.

I think this is one good idea:

http://archives.postgresql.org/message-id/29806.1340655...@sss.pgh.pa.us

The number of steps is an issue, but the likelihood of the actual
pg_upgrade run failing or doing the wrong thing is also something we
need to work on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Simon Riggs
On 8 August 2012 20:34, Robert Haas  wrote:
> On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs  wrote:
>> Incidentally, we can also optimise repeated inserts within a normal
>> transaction using this method, by implementing deferred unique
>> constraints. At present we say that unique constraints aren't
>> deferrable, but there's no reason they can't be, if we allow buffering
>> in the index. (Implementing a deferred constraint that won't fail if
>> we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo,
>> which is probably a bad plan, plus you'd need to sort the inputs, so
>> that particular nut is another issue altogether, AFAICS).
>
> We've had deferrable unique constraints since 9.0, courtesy of Dean Rasheed.

Yeah, but IIRC there was some issue (I can't seem to find the details)
about it not working quite right in production. Not sure now.


>> I think we need to implement buffering both to end of statement or end
>> of transaction, not just one or the other.
>
> Another (not necessarily better) idea is to use a buffer that's part
> of the index, like the GIN fastupdate stuff, so that there's no
> particular constraint on when the buffer has to be flushed, but
> competing index scans may be slower until it is.

I think that works very well for non-unique indexes, though it does
increase WAL traffic since you need to do a double hop into the index.
It's not possible for unique constraints/pk indexes since they need to
fail the transaction in case of duplicates.

The buffer I was imagining would be a private buffer within a
transaction, so wouldn't increase WAL.
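To make that concrete, here is a purely illustrative sketch of such a
per-transaction buffer; none of these types or functions exist in
PostgreSQL, and the flush routine is assumed to be supplied by the caller:

    #include <stddef.h>

    /* Illustrative only: pending index inserts accumulated in backend-local
     * memory (so no WAL), flushed into the real index when the buffer fills
     * and, at the latest, before end of statement or transaction. */
    typedef struct PendingIndexInsert
    {
        const void *index_key;      /* key datum(s) extracted from the heap tuple */
        const void *heap_pointer;   /* e.g. an item pointer to the heap row */
    } PendingIndexInsert;

    typedef struct IndexInsertBuffer
    {
        PendingIndexInsert *entries;
        size_t      nentries;
        size_t      capacity;       /* sized from work_mem, say */
    } IndexInsertBuffer;

    static void
    buffer_index_insert(IndexInsertBuffer *buf,
                        const void *key, const void *tid,
                        void (*flush)(IndexInsertBuffer *))
    {
        if (buf->nentries == buf->capacity)
            flush(buf);             /* bulk-insert into the index, empty buffer */
        buf->entries[buf->nentries].index_key = key;
        buf->entries[buf->nentries].heap_pointer = tid;
        buf->nentries++;
    }

Flushing no later than end of statement or end of transaction is the
assumption the discussion above relies on.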

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Robert Haas
On Tue, Aug 7, 2012 at 4:52 PM, Simon Riggs  wrote:
> Incidentally, we can also optimise repeated inserts within a normal
> transaction using this method, by implementing deferred unique
> constraints. At present we say that unique constraints aren't
> deferrable, but there's no reason they can't be, if we allow buffering
> in the index. (Implementing a deferred constraint that won't fail if
> we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo,
> which is probably a bad plan, plus you'd need to sort the inputs, so
> that particular nut is another issue altogether, AFAICS).

We've had deferrable unique constraints since 9.0, courtesy of Dean Rasheed.

> I think we need to implement buffering both to end of statement or end
> of transaction, not just one or the other.

Another (not necessarily better) idea is to use a buffer that's part
of the index, like the GIN fastupdate stuff, so that there's no
particular constraint on when the buffer has to be flushed, but
competing index scans may be slower until it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git diff patch in context diff format

2012-08-08 Thread Andrew Dunstan


On 08/08/2012 01:29 PM, Bruce Momjian wrote:

On Thu, Aug  2, 2012 at 05:03:04PM +0800, Qi Huang wrote:

Hi, hackers
 I was exporting my project to a patch file. As the patch review requires,
the patch needs to be in context diff format (http://wiki.postgresql.org/wiki/
Reviewing_a_Patch). But the git diff exports in a format similar to unified
format. What is everyone doing with patching now? Is there any standard way?

Have you read our wiki about git and diffs?

http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git




I must confess that, like Robert, I just use:

   git diff | filterdiff --format=context

I tried the git config stuff mentioned on the wiki, and it bit me a few 
times, I forget exactly how, and this just works without hassle.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of pg_trgm?

2012-08-08 Thread Tom Lane
... btw, I think there is another problem here, which is that
generate_wildcard_trgm will restart get_wildcard_part at the
same place that the second loop exits, which means it would
do the wrong thing if what it returns is a pointer to the
second char of an escape pair.  Consider for instance

foo\\%bar

The first call of get_wildcard_part will correctly extract "foo",
but then return a pointer to the second backslash.  So the second
call will think that the % is escaped, which it is not, leading to
a wrong decision about whether to pad "bar".

Probably a minimal fix for this could be made by backing up "endword"
one byte before returning it if in_escape is true when the second
loop exits.  That would not scale up to preserving the state of
in_wildcard_meta, but since the second loop never advances past a
meta char, that's okay for the moment.
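In rough C terms, the minimal fix described above might look something
like this at the point where the second loop has exited (a fragment only;
the variable names come from the discussion, not from the actual
contrib/pg_trgm source):

    /* Illustrative fragment, not the real get_wildcard_part(): if the loop
     * stopped while in_escape was still true, the pointer we are about to
     * return points at the character after a backslash.  Back up one byte
     * so the next call re-reads the escape and keeps its state consistent. */
    if (in_escape)
        endword--;
    return endword;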

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] avoid unnecessary failure to open restored WAL files

2012-08-08 Thread Fujii Masao
On Wed, Aug 8, 2012 at 3:08 AM, Simon Riggs  wrote:
> On 2 August 2012 17:18, Fujii Masao  wrote:
>> Hi,
>>
>> In HEAD and 9.2, the following scenario happens in archive recovery.
>>
>> 1. The archived WAL file is restored onto the temporary file name
>> "RECOVERYXLOG".
>> 2. The restored WAL file is renamed to the correct file name like
>> 00010002.
>> 3. The startup process tries to open the temporary file even though
>> it's already been renamed
>> and doesn't exist. This always fails.
>> 4. The startup process retries to open the correct file as a WAL file
>> in pg_xlog directory instead
>> of the archived file. This succeeds.
>>
>> The above failure of file open is unnecessary, so I think we can avoid
>> that. Attached patch
>> changes the startup process so that it opens the correct restored WAL
>> file after restoring the
>> archived WAL file.
>
> Looks to me that the strncpy is backwards and will still fail.  Please
> double check.

Oh, you're right. I wrongly placed the two arguments, "source" and "destination",
of strncpy in reverse order... Attached is the updated version of the patch.
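For reference, strncpy() copies into its first argument, so the corrected
call presumably has the shape below (an illustrative fragment, not the
actual xlog.c change; the function and buffer here are made up):

    #include <string.h>

    /* Illustrative only: copy the restored file's path into the buffer the
     * startup process will later open.  strncpy(dest, src, n). */
    static void
    remember_restored_file(char *xlogfpath, size_t bufsize, const char *restoredpath)
    {
        strncpy(xlogfpath, restoredpath, bufsize - 1);    /* dest, src, n */
        xlogfpath[bufsize - 1] = '\0';                    /* guarantee NUL termination */
    }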

Regards,

-- 
Fujii Masao


file_open_failure_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git diff patch in context diff format

2012-08-08 Thread Bruce Momjian
On Thu, Aug  2, 2012 at 05:03:04PM +0800, Qi Huang wrote:
> Hi, hackers
> I was exporting my project to a patch file. As the patch review requires,
> the patch needs to be in context diff format (http://wiki.postgresql.org/wiki/
> Reviewing_a_Patch). But the git diff exports in a format similar to unified
> format. What is everyone doing with patching now? Is there any standard way? 

Have you read our wiki about git and diffs?

http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git


-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of pg_trgm?

2012-08-08 Thread Tom Lane
Fujii Masao  writes:
> When I used pg_trgm, I encountered the problem that the search result of
> SeqScan was different from that of BitmapScan even if the search
> keyword was the same. Is this a bug?

Surely.

> The cause, ISTM, is that pg_trgm wrongly ignores the leading wildcard
> character (i.e., %) when an escape (i.e., \\) follows the wildcard character.
> Attached patch fixes this.

This patch doesn't seem quite right to me, though.  I agree that given
'%\x...', we should exit the loop with in_wildcard_meta still true.
However, if we have say '%\+...', the loop will continue, and now we
must reset in_wildcard_meta, no?  The next character is not adjacent to
a meta.  So I think in the "if (in_escape)" block, *both* assignments
should be moved after the iswordchr check.  Is there something I'm
missing?

Also, shouldn't we make a similar change in the second loop?  I guess
it's not strictly necessary given that that loop will exit as soon as
it sets in_wildcard_meta, but if you want to depend on that then the
reset in the second "if (in_escape)" block is altogether useless.  It
seems better to keep the logic of the two loops as similar as possible.

I'm also inclined to think that we should remove *both* flag resets
before the second loop.  The logic here is that we are reprocessing
the same character seen in the last iteration of the first loop,
right?  So the flag state ought to remain the same.
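Sketched in C, the suggested shape of the "if (in_escape)" block would be
roughly as follows (illustrative pseudocode in the spirit of the thread,
not the actual contrib/pg_trgm source; iswordchr and the flag names are
borrowed from the discussion, and the exact loop-exit mechanics are an
assumption):

    /* Illustrative shape only: decide whether the escaped character starts
     * a word before resetting either flag. */
    if (in_escape)
    {
        if (iswordchr(ptr))
            break;              /* word starts here; in_wildcard_meta is kept,
                                 * so the word counts as adjacent to the '%' */
        /* escaped character is not a word character: the escape pair is
         * consumed, and it also breaks adjacency to any preceding meta */
        in_escape = false;
        in_wildcard_meta = false;
    }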

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP fix proposal for bug #6123

2012-08-08 Thread Bruce Momjian
On Wed, Aug  8, 2012 at 09:26:41AM -0500, Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > Did we ever decide on this?
>  
> We discussed it to the point of consensus, and Tom wrote a patch to
> implement that.  Testing in my shop hit problems for which the cause
> was not obvious.  I don't know whether there is a flaw in the
> designed approach that we all missed, a simple programming bug of
> some sort in the patch, or pilot error on this end.  It's on my list
> of things to do, but it's hovering around 4th place on that list,
> and new things seem to be appearing at the top of that list at about
> the rate that I can clear them.  :-(
>  
> > Is it a TODO?
>  
> We don't want to lose track of it, but with a pending patch to
> debug, I'm not sure the TODO list is the right place to put it.

OK, I will let it linger on your TODO list then.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for LATERAL subqueries

2012-08-08 Thread Merlin Moncure
On Tue, Aug 7, 2012 at 6:08 PM, Tom Lane  wrote:
> I wrote:
>> What I'd like to do next, barring objections, is to band-aid the places
>> where the planner could crash on a LATERAL query (probably just make it
>> throw FEATURE_NOT_SUPPORTED errors), write some documentation, and
>> then commit what I've got.  After that I can go back to improve the
>> planner and work on the parser refactoring issues as separate patches.
>
> ... and done (though the pgsql-committers message seems to have got hung
> up for moderation).  I put some simplistic examples into section 7.2.1.5
> and the SELECT reference page ... if anybody has ideas for
> more-compelling small examples, please speak up.

This is just awesome.  Anyway, I was looking around the docs for
references to the old methodology of select-list SRF function calls.
This paragraph:
http://www.postgresql.org/docs/devel/static/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET

could probably use some enhancement describing best practices in a
LATERAL world and more examples of dealing with set returning
functions in general.  I also noticed that the built-in SRF page
(http://www.postgresql.org/docs/devel/static/functions-srf.html) is
inaccurate in its comment "This section describes functions that possibly
return more than one row. Currently the only functions in this class
are series generating functions", since at minimum we have 'unnest', so
that page could use some wordsmithing as well.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible bug in PostgreSQL 9.2 stable: TwoPhaseGetDummyBackendId()

2012-08-08 Thread Tom Lane
Robert Ross  writes:
> I have looked at the Postgres 9.2 stable and Postgres 9.2 beta 3 git  
> archives and this bug still appears to be present.

> TwoPhaseGetDummyProc returns a PGPROC*. In 9.0, it was safe for  
> TwoPhaseGetDummyBackendId() to cast this to a GlobalTransaction  
> because the GlobalTransactionData structure's first element was always  
> a PGPROC structure. However, in 9.2 this is no longer true.

Clearly an oversight in commit ed0b409d22346b1b027a4c2099ca66984d94b6dd :-(.
Will fix, thanks for the report!

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Tom Lane
Heikki Linnakangas  writes:
> On 08.08.2012 12:36, Jim Vanns wrote:
>> I suggest then that the documentation be updated to reflect this? And
>> again, perhaps the 'int' for nParams should be an int16_t or short?

> I don't think we should change the function signature for this, but I 
> think a sanity check for "nParams < 32768" in libpq would be in order.

We *can't* change the function signature like that.  In the first place,
it would be an ABI break necessitating a bump in the .so major version
number, which would cause pain vastly out of proportion to the size of
this problem.  In the second place, if we did that, then if someone made
the same mistake and tried to pass umpteen thousand parameters to a
statement, the same truncation Jim is complaining of would still happen.
Only this way, it would happen silently inside the C function call
mechanism, such that neither the application nor libpq could detect the
problem.

A runtime check for too many parameters seems appropriate.  Assuming
that the error message it throws is well written, I don't think we need
to adjust the documentation.  There are many limitations that are not
spelled out in the docs, and adding every one of them would not amount
to a net improvement.  Considering that Jim is the first to try this in
however many years it's been since 7.4, I don't think that everybody
needs to read about this restriction while they're trying to absorb what
PQexecParams does.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP fix proposal for bug #6123

2012-08-08 Thread Kevin Grittner
Bruce Momjian  wrote:
 
> Did we ever decide on this?
 
We discussed it to the point of consensus, and Tom wrote a patch to
implement that.  Testing in my shop hit problems for which the cause
was not obvious.  I don't know whether there is a flaw in the
designed approach that we all missed, a simple programming bug of
some sort in the patch, or pilot error on this end.  It's on my list
of things to do, but it's hovering around 4th place on that list,
and new things seem to be appearing at the top of that list at about
the rate that I can clear them.  :-(
 
> Is it a TODO?
 
We don't want to lose track of it, but with a pending patch to
debug, I'm not sure the TODO list is the right place to put it.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Magnus Hagander
On Wed, Aug 8, 2012 at 1:24 PM, Heikki Linnakangas
 wrote:
> On 08.08.2012 12:36, Jim Vanns wrote:
>>
>> Ah ha. Yes, you're correct. It does mention here that an Int16 is used
>> to specify the number of parameter format codes, values etc.
>>
>> I suggest then that the documentation be updated to reflect this? And
>> again, perhaps the 'int' for nParams should be an int16_t or short?
>
>
> I don't think we should change the function signature for this, but I think
> a sanity check for "nParams < 32768" in libpq would be in order.

+1 - and also a clear update to the documentation.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Jim Vanns
On Wed, 2012-08-08 at 14:24 +0300, Heikki Linnakangas wrote:
> On 08.08.2012 12:36, Jim Vanns wrote:
> > Ah ha. Yes, you're correct. It does mention here that an Int16 is used
> > to specify the number of parameter format codes, values etc.
> >
> > I suggest then that the documentation be updated to reflect this? And
> > again, perhaps the 'int' for nParams should be an int16_t or short?
> 
> I don't think we should change the function signature for this, but I 
> think a sanity check for "nParams < 32768" in libpq would be in order.

While I agree that changing the function signature is perhaps a little
too intrusive, considering it's been that way for a long, long time (I
would wager), I do think that yes, there should be a sanity check. But
more importantly, the documentation should explicitly state the
limitation or restriction, even though the parameter type is a 4-byte
integer. Otherwise people like myself will assume that all 4 bytes can
be used ;)

Regards,

Jim

> -- 
>   Heikki Linnakangas
>   EnterpriseDB   http://www.enterprisedb.com
> 

-- 
Jim Vanns
Systems Programmer
Framestore


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Heikki Linnakangas

On 08.08.2012 12:36, Jim Vanns wrote:

Ah ha. Yes, you're correct. It does mention here that an Int16 is used
to specify the number of parameter format codes, values etc.

I suggest then that the documentation be updated to reflect this? And
again, perhaps the 'int' for nParams should be an int16_t or short?


I don't think we should change the function signature for this, but I 
think a sanity check for "nParams < 32768" in libpq would be in order.
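Something along these lines, presumably, near the top of the
PQsendQueryParams/PQsendQueryGuts path (a sketch only; the surrounding
code and the exact error-reporting call are assumptions, not the real
fe-exec.c):

    /* Hypothetical sketch of the suggested sanity check; not actual libpq code. */
    if (nParams < 0 || nParams > 32767)
    {
        /* report the error through the connection's error buffer and bail
         * out, instead of letting the count be silently truncated to 16 bits */
        printfPQExpBuffer(&conn->errorMessage,
                          "number of parameters must be between 0 and 32767\n");
        return 0;
    }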


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Jim Vanns
Ah ha. Yes, you're correct. It does mention here that an Int16 is used
to specify the number of parameter format codes, values etc. 

I suggest then that the documentation be updated to reflect this? And
again, perhaps the 'int' for nParams should be an int16_t or short?

Naturally I have already modified our code to batch or chunk rather
large numbers of parameters to fit within this restriction, but I do
think it'd help others if the API reflected the same size data
types as the restricted backend ;)
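For anyone hitting the same wall, client-side batching can look roughly
like this (an illustrative sketch, not our production code; the table,
column, chunk size and error handling are made up and kept minimal):

    #include <stdio.h>
    #include <string.h>
    #include <libpq-fe.h>

    #define CHUNK 1000      /* comfortably below the protocol's 16-bit limit */

    /* Delete ids in chunks so each PQexecParams() call stays well under the
     * Bind message's parameter-count limit discussed in this thread. */
    static int
    delete_ids(PGconn *conn, const char *const *ids, int nids)
    {
        for (int off = 0; off < nids; off += CHUNK)
        {
            int     n = (nids - off < CHUNK) ? (nids - off) : CHUNK;
            char    sql[32768];
            int     len;

            /* build "DELETE FROM my_table WHERE id IN ($1,$2,...,$n)" */
            len = snprintf(sql, sizeof(sql), "DELETE FROM my_table WHERE id IN (");
            for (int i = 0; i < n; i++)
                len += snprintf(sql + len, sizeof(sql) - len, "%s$%d",
                                i ? "," : "", i + 1);
            snprintf(sql + len, sizeof(sql) - len, ")");

            PGresult *res = PQexecParams(conn, sql, n,
                                         NULL,        /* let the server infer types */
                                         ids + off,   /* text-format values */
                                         NULL, NULL,  /* lengths/formats: text */
                                         0);          /* text results */
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
            {
                fprintf(stderr, "DELETE failed: %s", PQerrorMessage(conn));
                PQclear(res);
                return -1;
            }
            PQclear(res);
        }
        return 0;
    }

Any chunk size below 32767 satisfies the protocol limit; smaller chunks
just keep each Bind message and IN list more manageable.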

Thanks Dmitriy,

Jim

On Wed, 2012-08-08 at 13:27 +0400, Dmitriy Igrishin wrote:
> Hey Jim,
> 
> 2012/8/8 Jim Vanns
> > Hello PG hackers. Yesterday I began diagnosing a peculiar bug in some
> > production code that has been happily running for months. I finally got
> > to the bottom of it despite the rather misleading error message. Anyway,
> > within a section of code we are making a DELETE call to the database via
> > the libpq call PQexecParams(). It failed with this message:
> >
> > 'ERROR:  bind message has 32015 parameter formats but 1 parameters'
> >
> > This was just plain wrong. In fact, the # of parameters was more like
> > 80,000. The area of code is quite clear. Despite this being a
> > particularly large number of parameters (as you can imagine this query
> > is built dynamically based on arbitrarily sized input) the data type for
> > nParams is a plain old 4-byte int. Upon further and deeper
> > inspection I find that this 4-byte int is truncated to two bytes just
> > before going down the wire.
> >
> > There is no mention of any restriction in the 9.1.4 documentation:
> >
> > http://www.postgresql.org/docs/9.1/static/libpq-exec.html
> >
> > And the interface quite clearly accepts a 4-byte int; however, the
> > PQsendQueryGuts() function on line 1240 of src/interfaces/libpq/fe-exec.c
> > just blatantly truncates the integer - it calls pqPutInt() for nParams
> > with a literal 2 rather than 4. It does this several times, in fact.
> >
> > Unless I'm barking mad, surely this should either
> >
> > a) Be fixed and send 4 with nParams for pqPutInt() rather than 2
> > b) Documented (and the type changed) as only being a 2 byte int
> >    and therefore having a restriction on the number of parameters
> >    permitted in PQexecParams().
> >
> > Could someone either verify or correct me before I submit an official
> > bug report!?
> >
> > Regards,
> >
> > Jim Vanns
>
> AFAIK, this is a limitation of the frontend/backend protocol which allows
> you to bind no more than 2^16 parameters.
> See the description of the Bind (F) message here
> http://www.postgresql.org/docs/9.2/static/protocol-message-formats.html
>
> --
> // Dmitriy.
>

-- 
Jim Vanns
Systems Programmer
Framestore


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Dmitriy Igrishin
Hey Jim,

2012/8/8 Jim Vanns 

> Hello PG hackers. Yesterday I began diagnosing a peculiar bug in some
> production code that has been happily running for months. I finally got
> to the bottom of it despite the rather misleading error message. Anyway,
> within a section of code we are making a DELETE call to the database via
> the libpq call PQexecParams(). It failed with this message:
>
> 'ERROR:  bind message has 32015 parameter formats but 1 parameters'
>
> This was just plain wrong. In fact, the # of parameters was more like
> 80,000. The area of code is quite clear. Despite this being a
> particularly large number of parameters (as you can imagine this query
> is built dynamically based on arbitrarily sized input) the data type for
> nParams is a plain old 4-byte int. Upon further and deeper
> inspection I find that this 4 byte int is truncated to two bytes just
> before going down the wire.
>
> There is no mention of any restriction in the 9.1.4 documentation:
>
> http://www.postgresql.org/docs/9.1/static/libpq-exec.html
>
> And the interface quite clearly accepts a 4-byte int; however, the
> PQsendQueryGuts() function on line 1240 of src/interfaces/libpq/fe-exec.c
> just blatantly truncates the integer - it calls pqPutInt() for nParams
> with a literal 2 rather than 4. It does this several times, in fact.
>
> Unless I'm barking mad, surely this should either
>
> a) Be fixed and send 4 with nParams for pqPutInt() rather than 2
> b) Documented (and the type changed) as only being a 2 byte int
>and therefore having a restriction on the number of parameters
>permitted in PQexecParams().
>
> Could someone either verify or correct me before I submit an official bug
> report!?
>
> Regards,
>
> Jim Vanns
>
AFAIK, this is a limitation of the frontend/backend protocol which allows
you to bind no more than 2^16 parameters.
See the description of the Bind (F) message here
http://www.postgresql.org/docs/9.2/static/protocol-message-formats.html

-- 
// Dmitriy.


[HACKERS] Bug in libpq implentation and omission in documentation?

2012-08-08 Thread Jim Vanns
Hello PG hackers. Yesterday I began diagnosing a peculiar bug in some
production code that has been happily running for months. I finally got
to the bottom of it despite the rather misleading error message. Anyway,
within a section of code we are making a DELETE call to the database via
the libpq call PQexecParams(). It failed with this message:

'ERROR:  bind message has 32015 parameter formats but 1 parameters'

This was just plain wrong. In fact, the # of parameters was more like
80,000. The area of code is quite clear. Despite this being a
particularly large number of parameters (as you can imagine this query
is built dynamically based on arbitrarily sized input) the data type for
nParams is a plain old 4-byte int. Upon further and deeper
inspection I find that this 4 byte int is truncated to two bytes just
before going down the wire.

There is no mention of any restriction in the 9.1.4 documentation:

http://www.postgresql.org/docs/9.1/static/libpq-exec.html

And the interface quite clearly accepts a 4-byte int; however, the
PQsendQueryGuts() function on line 1240 of src/interfaces/libpq/fe-exec.c
just blatantly truncates the integer - it calls pqPutInt() for nParams
with a literal 2 rather than 4. It does this several times, in fact.

Unless I'm barking mad, surely this should either

a) Be fixed and send 4 with nParams for pqPutInt() rather than 2
b) Documented (and the type changed) as only being a 2 byte int
   and therefore having a restriction on the number of parameters
   permitted in PQexecParams().

Could someone either verify or correct me before I submit an official bug 
report!?

Regards,

Jim Vanns

-- 
Jim Vanns
Systems Programmer
Framestore


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inserting heap tuples in bulk in COPY

2012-08-08 Thread Simon Riggs
On 8 August 2012 03:44, Jeff Janes  wrote:
> On Tue, Aug 7, 2012 at 1:52 PM, Simon Riggs  wrote:
>> On 7 August 2012 20:58, Jeff Janes  wrote:
>>> Hi Heikki,
>>>
>>> Is the bulk index insert still an active area for you?
>>>
>>> If not, is there some kind of summary of design or analysis work
>>> already done, which would be useful for someone else interested in
>>> picking it up?
>>
>> The main cost comes from repeated re-seeking down the index tree to
>> find the insertion point, but repeated lock and pin operations on
>> index buffers are also high. That is optimisable if the index inserts
>> are grouped, as they will be with monotonic indexes, (e.g. SERIAL), or
>> with partial monotonic (i.e. with Parent/Child relationship, on Child
>> table many inserts will be of the form (x,1), (x,2), (x, 3) etc, e.g.
>> Order/OrderLine).
>>
>> All we need do is buffer the index inserts before inserting
>> them into the main index. As long as we flush the buffer before end of
>> transaction, we're good.
>>
>> Incidentally, we can also optimise repeated inserts within a normal
>> transaction using this method, by implementing deferred unique
>> constraints. At present we say that unique constraints aren't
>> deferrable, but there's no reason they can't be, if we allow buffering
>> in the index. (Implementing a deferred constraint that won't fail if
>> we do UPDATE foo SET pk = pk+1 requires a buffer the size of foo,
>> which is probably a bad plan, plus you'd need to sort the inputs, so
>> that particular nut is another issue altogether, AFAICS).
>>
>> Suggested implementation is to buffer index tuples at the generic
>> index API, then implement a bulk insert index API call that can then
>> be implemented for each AM separately. Suggested buffer size is
>> work_mem. Note we must extract index tuple from heap tuples -
>> refetching heap blocks to get rows is too costly.
>
> OK, thanks for the summary.  It looks like your plans are to improve
> the case where the index maintenance is already CPU limited.  I was
> more interested in cases where the regions of the index(es) undergoing
> active insertion do not fit into usable RAM, so the limit is the
> random IO needed to fetch the leaf pages in order to update them or to
> write them out once dirtied.  Here too buffering is probably the
> answer, but the size of the buffer needed to turn that IO from
> effectively random to effectively sequential is probably much larger
> than the size of the buffer you are contemplating.

The buffer size can be variable, yes. I was imagining a mechanism that
worked for normal INSERTs as well as COPY. Perhaps we could say buffer
is work_mem with INSERT and maintenance_work_mem with COPY.

Very large index appends are useful, but currently not very easily
usable because of the transactional nature of COPY. If we could reject
rows without ERROR it would be more practical.

I'm not planning to work on this, so all these comments are offered for
your assistance.

>> I think we need to implement buffering both to end of statement or end
>> of transaction, not just one or the other.
>
> With the planner deciding which would be better, or explicit user action?

Probably both: on/off/choose.

Deferring the unique check would change the point at which errors were
reported in a transaction, which might not be desirable for some. I
think the SQL standard has something to say about this also, so that needs
care. But in general, if your tables use generated PK values they
should be able to benefit from this, so I would suggest a default
setting of choose.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers