Re: [HACKERS] Enabling Checksums
On 18 December 2012 02:21, Jeff Davis pg...@j-davis.com wrote: On Mon, 2012-12-17 at 19:14 +, Simon Riggs wrote: We'll need a way of expressing some form of corruption tolerance. zero_damaged_pages is just insane,

The main problem I see with zero_damaged_pages is that it could potentially write out the zero page, thereby really losing your data if it wasn't already lost. (Of course, we document that you should have a backup first, but it's still dangerous.) I assume that this is the same problem you are talking about.

I think we should discuss whether we accept my premise: checksums will actually detect more errors than we see now, and people will want to do something about that. Returning to backup is one way of handling it, but on a busy production system with pressure on, there is an incentive to implement a workaround, not a fix. It's not an easy call to say "we've got 3 corrupt blocks, so I'm going to take the whole system offline while I restore from backup". And if you do restore from backup, and the backup also contains the 3 corrupt blocks, what then?

Clearly part of the response could involve pg_dump on the damaged structure, at some point.

I suppose we could have a new ReadBufferMaybe function that would only be used by a sequential scan; and then just skip over the page if it's corrupt, depending on a GUC. That would at least allow sequential scans to (partially) work, which might be good enough for some data recovery situations. If a catalog index is corrupted, that could just be rebuilt. Haven't thought about the details, though.

Not sure if you're being facetious here or not. Mild reworking of the logic for heap page access could cope with a NULL buffer response and subsequent looping, which would allow us to run pg_dump against a damaged table so data can be saved, keeping the file intact for further analysis. I'm suggesting we work a little harder than "your block is corrupt" and give some thought to what the user will do next. Indexes are a good case, because we can/should report the block error, mark the index as invalid and then hint that it should be rebuilt.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
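For context, a minimal sketch of the workaround being called insane here (the table name is hypothetical; zero_damaged_pages is a real, superuser-only GUC, and enabling it means any page that fails header validation is zeroed in memory as it is read, and possibly written back later, so its contents are unrecoverable):

    SET zero_damaged_pages = on;          -- destructive: damaged pages are zeroed as they are read
    SELECT count(*) FROM damaged_table;   -- hypothetical table; forces every page to be read
    RESET zero_damaged_pages;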
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 02:41 AM, Bruce Momjian wrote: On Mon, Dec 17, 2012 at 12:14:29PM +0100, Bernhard Schrader wrote: Hello together, last Thursday I upgraded one of our 9.0.6 PostgreSQL servers to 9.2.2 with pg_upgrade. So far everything seemed to work, but we now discover problems with the enum types. If we run one specific query it breaks every time with such an error message: ERROR: invalid internal value for enum: 520251 If this number is supposed to represent the enumtypid, it does not exist anymore in pg_enum. How could I solve this problem? Should we regenerate all enums? Or what could we do? Hopefully anyone has a clue; Google doesn't seem to be the resource for this problem.

We seriously tested the enum code so I am pretty confused why this is failing. If you do pg_dump --binary-upgrade --schema-only, do you see a number like this being defined just before the enum is added?

Hi Bruce, if I dump this db and grep through the dump, I can't find the number. As far as we can see, the enum that is affected now has the enumtypid 16728. Is there a table which keeps the possible typecasts from enum to text / text to enum etc.? If so, maybe the mapping in there is corrupt since the upgrade.

regards

-- Bernhard Schrader System Administration InnoGames GmbH Harburger Schloßstraße 28 (Channel 4) - 21079 Hamburg - Germany Tel +49 40 7889335-53 Fax +49 40 7889335-22 Managing Directors: Hendrik Klindworth, Eike Klindworth, Michael Zillmer VAT-ID: DE264068907 Amtsgericht Hamburg, HRB 108973 http://www.innogames.com – bernhard.schra...@innogames.de

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: index support for regexp search
On Tue, Dec 18, 2012 at 11:45 AM, Erik Rijkers e...@xs4all.nl wrote: On Tue, December 18, 2012 08:04, Alexander Korotkov wrote: I ran the same test again: HEAD versus trgm_regex v6, 7 and 9. In v9 there is some gain but also some regression. It remains a difficult problem... If I get some time in the holidays I'll try to diversify the test program; it is now too simple.

Note that regexes which contain {,n} are likely not what you expect.

test=# select 'xq' ~ 'x[aeiou]{,2}q';
 ?column?
----------
 f
(1 row)

test=# select 'xa{,2}q' ~ 'x[aeiou]{,2}q';
 ?column?
----------
 t
(1 row)

You should use {0,n} to express from 0 to n occurrences.

-- With best regards, Alexander Korotkov.
Re: [HACKERS] WIP: index support for regexp search
On Tue, December 18, 2012 09:45, Alexander Korotkov wrote: You should use {0,n} to express from 0 to n occurrences. Thanks, but I know that of course. It's a testing program; and in the end robustness with unexpected or even wrong input is as important as performance. (To put it bluntly, I am also trying to get your patch to fall over ;-)) Thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
On 12/18/12 3:17 AM, Simon Riggs wrote: Clearly part of the response could involve pg_dump on the damaged structure, at some point. This is the main thing I wanted to try out more, once I have a decent corruption generation tool. If you've corrupted a single record but can still pg_dump the remainder, that seems the best we can do to help people recover from that. Providing some documentation on how to figure out what rows are in that block, presumably by using the contrib inspection tools, would be helpful too. Indexes are a good case, because we can/should report the block error, mark the index as invalid and then hint that it should be rebuilt. Marking a whole index invalid because there's one bad entry has enough downsides that I'm not sure how much we'd want to automate that. Not having that index available could easily result in an effectively down system due to low performance. The choices are uglier if it's backing a unique constraint. In general, what I hope people will be able to do is switch over to their standby server, and then investigate further. I think it's unlikely that people willing to pay for block checksums will only have one server. Having some way to nail down if the same block is bad on a given standby seems like a useful interface we should offer, and it shouldn't take too much work. Ideally you won't find the same corruption there. I'd like a way to check the entirety of a standby for checksum issues, ideally run right after it becomes current. It seems the most likely way to see corruption on one of those is to replicate a corrupt block. There is no good way to make the poor soul who has no standby server happy here. You're just choosing between bad alternatives. The first block error is often just that--the first one, to be joined by others soon afterward. My experience at how drives fail says the second error is a lot more likely after you've seen one. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
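As a sketch of the kind of documentation being asked for above (the table name and block number are made up, and this assumes the contrib pageinspect module), heap_page_items() lists the line pointers and tuple headers of a single block, which is usually enough to work out which rows live in the suspect page:

    CREATE EXTENSION pageinspect;

    SELECT lp, lp_off, lp_len, t_xmin, t_xmax, t_ctid
    FROM heap_page_items(get_raw_page('some_table', 123));

If the block itself is unreadable this will fail as well, but even then the block number tells you that the affected rows are the ones with ctid (123,*).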
Re: [HACKERS] WIP: index support for regexp search
On Tue, Dec 18, 2012 at 12:51 PM, Erik Rijkers e...@xs4all.nl wrote: On Tue, December 18, 2012 09:45, Alexander Korotkov wrote: You should use {0,n} to express from 0 to n occurrences. Thanks, but I know that of course. It's a testing program; and in the end robustness with unexpected or even wrong input is as important as performance. (To put it bluntly, I am also trying to get your patch to fall over ;-))

I found that most of the regressions in the 0.9 version are in the {,n} cases. The new version of the patch uses more trigrams than the previous versions. For example, for the regex 'x[aeiou]{,2}q', the 0.7 version uses the trigrams '__2', '_2_' and '__q', while the 0.9 version uses the trigrams 'xa_', 'xe_', 'xi_', 'xo_', 'xu_', '__2', '_2_' and '__q'. But actually the trigrams '__2' and '_2_' never occur; it is enough to have one of them, and all the others just cause a slowdown. At the same time, we can't reasonably decide which trigrams to use without knowing their frequencies. For example, if the trigrams 'xa_', 'xe_', 'xi_', 'xo_' and 'xu_' were altogether more rare than '__2', the newer version of the patch would be faster.

-- With best regards, Alexander Korotkov.
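Not part of the patch under discussion, but as a quick way to see the plain-string side of the problem: the stock pg_trgm extension's show_trgm() shows which trigrams are extracted from a given string, i.e. what an index lookup ultimately has to match against (words are padded with roughly two leading and one trailing blank):

    CREATE EXTENSION IF NOT EXISTS pg_trgm;
    SELECT show_trgm('xaeq');   -- yields trigrams along the lines of "  x", " xa", "xae", "aeq", "eq "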
Re: [HACKERS] Error restoring from a base backup taken from standby
On 18.12.2012 00:35, Simon Riggs wrote: On 17 December 2012 17:39, Heikki Linnakangas hlinnakan...@vmware.com wrote:

(This is different from the other issue related to timeline switches I just posted about. There's no timeline switch involved in this one.) If you do pg_basebackup -x against a standby server, in some circumstances the backup fails to restore with an error like this:

C 2012-12-17 19:09:44.042 EET 7832 LOG: database system was not properly shut down; automatic recovery in progress
C 2012-12-17 19:09:44.091 EET 7832 LOG: record with zero length at 0/1764F48
C 2012-12-17 19:09:44.091 EET 7832 LOG: redo is not required
C 2012-12-17 19:09:44.091 EET 7832 FATAL: WAL ends before end of online backup
C 2012-12-17 19:09:44.091 EET 7832 HINT: All WAL generated while online backup was taken must be available at recovery.
C 2012-12-17 19:09:44.092 EET 7831 LOG: startup process (PID 7832) exited with exit code 1
C 2012-12-17 19:09:44.092 EET 7831 LOG: aborting startup due to startup process failure

I spotted this bug while reading the code, and it took me quite a while to actually construct a test case to reproduce the bug, so let me begin by discussing the code where the bug is. You get the above error, "WAL ends before end of online backup", when you reach the end of WAL before reaching the backupEndPoint stored in the control file, which originally comes from the backup_label file. backupEndPoint is only used in a base backup taken from a standby; in a base backup taken from the master, the end-of-backup WAL record is used instead to mark the end of backup. In the xlog redo loop, after replaying each record, we check if we've just reached backupEndPoint, and clear it from the control file if we have. Now the problem is, if there are no WAL records after the checkpoint redo point, we never even enter the redo loop, so backupEndPoint is not cleared even though it's reached immediately after reading the initial checkpoint record.

To deal with the similar situation wrt. reaching consistency for hot standby purposes, we call CheckRecoveryConsistency() before the redo loop. The straightforward fix is to copy-paste the check for backupEndPoint to just before the redo loop, next to the CheckRecoveryConsistency() call. Even better, I think we should move the backupEndPoint check into CheckRecoveryConsistency(). It's already responsible for keeping track of whether minRecoveryPoint has been reached, so it seems like a good idea to do this check there as well. Attached is a patch for that (for 9.2), as well as a script I used to reproduce the bug. The script is a bit messy, and requires tweaking the paths at the top. Anyone spot a problem with this?

Yep. The problem is one of design, not merely a coding error. The design assumes that the master is up, connected and doing work. Which is perfectly reasonable, since this is the reason we are doing a backup from the standby. But obviously not always true, so what we're saying is that the selection of backupEndPoint is incorrect in the first place. Putting code in to cope with that poor choice at recovery seems wrong - we should record the correct location.

Hmm, pg_controldata says:

pg_control version number:            922
Catalog version number:               201204301
Database system identifier:           5823207860608399803
Database cluster state:               in crash recovery
pg_control last modified:             ti 18. joulukuuta 2012 11.00.58
Latest checkpoint location:           0/1764EE8
Prior checkpoint location:            0/1764EE8
Latest checkpoint's REDO location:    0/1764EE8
Latest checkpoint's TimeLineID:       1
...
Minimum recovery ending location:     0/1764F48
Backup start location:                0/1764EE8
Backup end location:                  0/1764F48
...

That seems correct to me. The backup is considered valid after reaching 0/1764F48, which is where the checkpoint record ends. minRecoveryPoint is set to the same. What do you think it should be set to?

In passing, I see another problem as well. We assume that the backup is still valid if we promote the standby to a master during the backup, but that isn't documented and so isn't fully analysed to see that it is true.

No, that is taken care of in do_pg_stop_backup:

    /*
     * Parse the BACKUP FROM line. If we are taking an online backup from the
     * standby, we confirm that the standby has not been promoted during the
     * backup.
     */
    ptr = strstr(remaining, "BACKUP FROM:");
    if (!ptr || sscanf(ptr, "BACKUP FROM: %19s\n", backupfrom) != 1)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
    if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
        ereport(ERROR,
Re: [HACKERS] Error restoring from a base backup taken from standby
On 18 December 2012 09:18, Heikki Linnakangas hlinnakan...@vmware.com wrote: That seems correct to me. The backup is considered valid after reaching 0/1764F48, which is where the checkpoint record ends. minRecoveryPoint is set to the same. What do you think it should be set to? I already said? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser Cruft in gram.y
Robert Haas robertmh...@gmail.com writes: And on the other hand, if you could get a clean split between the two grammars, then regardless of exactly what the split was, it might seem a win. But it seemed to me when I looked at this that you'd have to duplicate a lot of stuff and the small parser still wouldn't end up being very small, which I found hard to get excited about.

I think the goal is not so much getting a much smaller parser as having a separate parser whose bloat you don't care about, so that you can improve DDL without fearing performance regressions in the main parser. If we come out with no regression and no gain on the main query parser, I say it is still worth the separation effort. And anyway we only add things to the main parser (queries) when the standard saith we have to.

Tom Lane t...@sss.pgh.pa.us writes: I'm not sure what other tool might be better though. I looked through http://en.wikipedia.org/wiki/Comparison_of_parser_generators#Deterministic_context-free_languages but it seems our options are a bit limited if we want something that produces C. It's not clear to me that any of the likely options are as mature as bison, let alone likely to substantially outperform it. (For instance, Hyacc sounded pretty promising until I got to the part about "it doesn't yet support %union or %type".) Still, I didn't spend much time on this --- maybe somebody else would like to do a bit more research.

I spent a very little time on it too, with a different search angle, and found an experimental thing that might be worth looking at, or maybe not.

http://en.wikipedia.org/wiki/Parsing_expression_grammar
http://piumarta.com/software/peg/

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Error restoring from a base backup taken from standby
On 18.12.2012 11:30, Simon Riggs wrote: On 18 December 2012 09:18, Heikki Linnakangas hlinnakan...@vmware.com wrote: That seems correct to me. The backup is considered valid after reaching 0/1764F48, which is where the checkpoint record ends. minRecoveryPoint is set to the same. What do you think it should be set to? I already said?

In this situation, there are no more WAL records in the standby the backup was taken from, after the checkpoint record. Using receivedUpto would yield the same value, and you'd still have the same problem.

- Heikki

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup from cascading standby after timeline switch
On 18 December 2012 00:53, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com wrote: I also wonder if pg_basebackup should include *all* timeline history files in the backup, not just the latest one strictly required to restore. Why? the timeline history file includes the previous timelines already. The original intention was that the WAL archive might contain multiple timeline files corresponding to various experimental recovery attempts; furthermore, such files might be hand-annotated (that's why there's a comment provision). So they would be at least as valuable from an archival standpoint as the WAL files proper. I think we ought to just copy all of them, period. Anything less is penny-wise and pound-foolish. What I'm saying is that the new history file is created from the old one, so the latest file includes all the history as a direct copy. The only thing new is one line of information. Copying all files grows at O(N^2) with redundancy and will eventually become a space problem and a performance issue for smaller systems. There should be some limit to keep this sane, for example, the last 32 history files, or the last 1000 lines of history. Some sane limit. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set visibility map bit after HOT prune
On Sunday, December 16, 2012 11:14 PM Tom Lane wrote: Pavan Deolasee pavan.deola...@gmail.com writes: On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com wrote: As explained above, I disagree that it looks like a good idea, and you've shown no evidence it would be or is true.

Lets separate out these two issues. What you are suggesting as a follow up to Tom's idea, I've no objection to that, and it might be a worthwhile optimisation to try out. But this patch itself does not attempt to deal with that; it is a separate work item and will require invasive changes and tests.

Another thing that would need to be considered, if we do want to restrict when pruning happens, is whether it is worth introducing some other path altogether for setting the all-visible bit. Or perhaps we should modify autovacuum's rules so that it will fire on relations for which there might be lots of unmarked all-visible pages.

Can something similar also be used for putting deleted index pages into the FSM? The reason is that currently deleted index pages are recorded in the FSM only in the next vacuum cycle. So after a bulk index update (with always-increasing values), even if autovacuum has run once, it still does not put the deleted index pages into the FSM. Now assume there are no operations which can lead to another autovacuum on the same table; the next cycle of bulk updates will then allocate new index pages. I had observed in one of my tests that if a bulk index update happens such that the new value is always increasing, then index bloat happens. As per my initial analysis, it seems one of the reasons is what I described above. If required, I can create a self-contained test which shows that bulk index updates can lead to index bloat.

With Regards, Amit Kapila.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
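A self-contained sketch of the scenario described above (the table name and row counts are made up; the point is that every bulk update moves all keys past the right end of the index, and, per the observation above, pages emptied in the previous cycle are not yet reusable):

    CREATE TABLE bloat_test (id bigint PRIMARY KEY, payload text);
    INSERT INTO bloat_test SELECT g, 'x' FROM generate_series(1, 100000) g;

    SELECT pg_size_pretty(pg_relation_size('bloat_test_pkey'));  -- baseline index size

    UPDATE bloat_test SET id = id + 100000;  -- every new key sorts after all existing keys
    VACUUM bloat_test;                       -- emptied index pages are not yet back in the FSM
    UPDATE bloat_test SET id = id + 100000;  -- so this cycle allocates fresh pages again

    SELECT pg_size_pretty(pg_relation_size('bloat_test_pkey'));  -- noticeably larger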
[HACKERS] review: pgbench - aggregation of info written into log
Hello, this patch was proposed in http://archives.postgresql.org/pgsql-hackers/2012-08/msg00817.php and there were no objections.

* there are no issues applying the patch
* no warnings
* the code is clean and respects our source code policy
* tested without errors

My last objection was addressed - the last interval isn't logged, but that is better, because it can be incomplete, so it is better to drop it.

This patch is ready for commit.

Regards Pavel

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XLByte* usage
Andres Freund and...@2ndquadrant.com writes: In 2) unfortunately one has to make a decision about which way to simplify negated XLByte(LT|LE) expressions. I tried to make that choice very carefully and went over every change several times after that, so I hope there aren't any bad changes, but more eyeballs are needed.

+ if (lsn < PageGetLSN(page))
+ if (lsn <= PageGetLSN(page))
+ if (lsn >= PageGetLSN(page))
+ if (max_lsn < this_lsn)

My only comment here would be that I would like it much better if the code always used the same operators and, as we read from left to right, that we pick < and <=.

Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] XLByte* usage
On 2012-12-18 13:14:10 +0100, Dimitri Fontaine wrote: Andres Freund and...@2ndquadrant.com writes: In 2) unfortunately one has to make a decision about which way to simplify negated XLByte(LT|LE) expressions. I tried to make that choice very carefully and went over every change several times after that, so I hope there aren't any bad changes, but more eyeballs are needed.

+ if (lsn < PageGetLSN(page))
+ if (lsn <= PageGetLSN(page))
+ if (lsn >= PageGetLSN(page))
+ if (max_lsn < this_lsn)

My only comment here would be that I would like it much better if the code always used the same operators and, as we read from left to right, that we pick < and <=.

I did that in most places (check the xlog.c changes), but in this case it didn't seem to be appropriate because that would have meant partially reversing the order of operands, which would have looked confusing as well, given this check is a common pattern across nearly every xlog replay function. Imo it's a good idea to try to choose < or <= as the operator, but it's also important to keep the order of operands consistent inside a single function/similar functions.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Review : Add hooks for pre- and post-processor executables for COPY and \copy
Basic stuff:
- Rebase of the patch is required.
- Compiles cleanly without any errors/warnings.
- Regression tests pass.

What it does:
- This patch is useful when the COPY command's input/output is stored in a compressed format, or when some command/script produces the input or consumes the output, without generating intermediate temporary files. This feature can be used on the server side via the COPY statement by an administrator, or via psql's internal \copy command by any user.

Code review comments:
1. Modify the comment in the function header of parse_slash_copy (it needs to be updated for the new syntax).
2. Comments for the functions OpenPipeStream and ClosePipeStream are missing.
3. Script errors are not directly visible to the user; if there are problems in the script there is no way to clean up. Shouldn't this be mentioned in the user manual?

Test case issues:
1. Broken pipe is not handled in the case of the psql \copy command. The issues are as follows (verified on SuSE Linux 10.2):
   1) psql exits when a \copy ... TO command is issued and the command/script is not found. When popen is called in write mode it creates a valid file descriptor, and when psql tries to write to the file a "Broken pipe" error occurs, which is not handled.
      psql# \copy pgbench_accounts TO PROGRAM '../compress.sh pgbench_accounts4.txt'
   2) When a \copy command is in progress and the program/command is killed or crashes due to any problem, psql exits.

Scripts used in the test cases:
1. compress.sh
   echo 'cat $1' > compress.sh
   echo 'bzip2 -z $1' >> compress.sh
   chmod +x compress.sh
2. decompress.sh
   echo 'bzip2 -d -c -k $*' > decompress.sh
   chmod +x decompress.sh

The test cases executed are attached with this mail.

With Regards, Amit Kapila.

--1. check success case admin can use this COPY statement feature.
select count(*) from pgbench_accounts;
copy pgbench_accounts TO PROGRAM '/home/kiran/installation/bin/compress.sh /home/kiran/installation/bin/pgbench_accounts1.txt';
delete from pgbench_accounts;
copy pgbench_accounts FROM PROGRAM '/home/kiran/installation/bin/decompress.sh /home/kiran/installation/bin/pgbench_accounts1.txt.bz2';
select count(*) from pgbench_accounts;

--2. check commands are running with relative path to database directory.
copy pgbench_accounts TO PROGRAM '../compress.sh ../pgbench_accounts4.txt';
delete from pgbench_accounts;
copy pgbench_accounts FROM PROGRAM '../decompress.sh ../pgbench_accounts4.txt.bz2';
select count(*) from pgbench_accounts;

--3. invalid case; where command not found
select count(*) from pgbench_accounts;
copy pgbench_accounts TO PROGRAM 'compress.sh pgbench_accounts1.txt';
copy pgbench_accounts FROM PROGRAM 'decompress.sh pgbench_accounts1.txt.bz2';

--4. invalid case; where file / script having problem.
select count(*) from pgbench_accounts;
copy pgbench_accounts TO PROGRAM '../compress.sh abc/pgbench_accounts1.txt';
copy pgbench_accounts FROM PROGRAM '../decompress.sh abc/pgbench_accounts1.txt.bz2';

--repeat all above cases from psql command.
--5. check success case admin can use this \COPY psql feature.
select count(*) from pgbench_accounts;
\copy pgbench_accounts TO PROGRAM '/home/kiran/installation/bin/compress.sh /home/kiran/installation/bin/pgbench_accounts1.txt';
delete from pgbench_accounts;
\copy pgbench_accounts FROM PROGRAM '/home/kiran/installation/bin/decompress.sh /home/kiran/installation/bin/pgbench_accounts1.txt.bz2';
select count(*) from pgbench_accounts;

--6. check commands are running with relative path to psql current working directory.
\copy pgbench_accounts TO PROGRAM '../compress.sh ../pgbench_accounts4.txt';
delete from pgbench_accounts;
\copy pgbench_accounts FROM PROGRAM '../decompress.sh ../pgbench_accounts4.txt.bz2';
select count(*) from pgbench_accounts;

--7. invalid case; where command not found
select count(*) from pgbench_accounts;
\copy pgbench_accounts TO PROGRAM 'compress.sh pgbench_accounts1.txt';
\copy pgbench_accounts FROM PROGRAM 'decompress.sh pgbench_accounts1.txt.bz2';

--8. invalid case; where file / script having problem.
select count(*) from pgbench_accounts;
\copy pgbench_accounts TO PROGRAM '../compress.sh abc/pgbench_accounts1.txt';
\copy pgbench_accounts FROM PROGRAM '../decompress.sh abc/pgbench_accounts1.txt.bz2';

--9. Normal user can't execute COPY statement
create user abc;
\c abc
select count(*) from pgbench_accounts;
copy pgbench_accounts TO PROGRAM '/home/kiran/installation/bin/compress.sh /home/kiran/installation/bin/pgbench_accounts1.txt';
delete from pgbench_accounts;
copy pgbench_accounts FROM PROGRAM '/home/kiran/installation/bin/decompress.sh /home/kiran/installation/bin/pgbench_accounts1.txt.bz2';
select count(*) from pgbench_accounts;

--10. Normal user can execute \COPY psql
Re: [HACKERS] configure.in and setproctitle/optreset problem
Re: Tom Lane 2012-12-18 26465.1355798...@sss.pgh.pa.us I think we should assume that the libedit developers are utterly clueless about not trampling on application namespace, and just cut that library out of *all* our link checks except for the symbols we specifically expect to get from libedit/libreadline. Hence, I propose the attached slightly more extensive patch. I've verified that this produces a working build with Debian's libedit. I confirm that your patch fixes the problem on master and REL9_2_STABLE (manually applied, one hunk failed). Thanks! Christoph -- Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
Greg Smith wrote: In general, what I hope people will be able to do is switch over to their standby server, and then investigate further. I think it's unlikely that people willing to pay for block checksums will only have one server. Having some way to nail down if the same block is bad on a given standby seems like a useful interface we should offer, and it shouldn't take too much work. Ideally you won't find the same corruption there. I'd like a way to check the entirety of a standby for checksum issues, ideally run right after it becomes current. It seems the most likely way to see corruption on one of those is to replicate a corrupt block. There is no good way to make the poor soul who has no standby server happy here. You're just choosing between bad alternatives. The first block error is often just that--the first one, to be joined by others soon afterward. My experience at how drives fail says the second error is a lot more likely after you've seen one. +1 on all of that. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] configure.in and setproctitle/optreset problem
Peter Eisentraut pete...@gmx.net writes: On Mon, 2012-12-17 at 18:02 +0100, Christoph Berg wrote: I have no clue why no one else has seen this bug before, but the reason for the error seems to be that configure is invoking the setproctitle test including -ledit. libedit.so is linked to libbsd.so, which in turn contains setproctitle(), so HAVE_SETPROCTITLE is set even this is Linux, not BSD. I reported this here: http://archives.postgresql.org/pgsql-bugs/2012-07/msg00127.php I thought the issue sounded familiar ... The correct fix, IMO/IIRC, is to add LDFLAGS=-Wl,--as-needed before running most of the configure checks, instead of after. Meh. It's not clear to me at all that that fixes the issue here, or at least that it does so in any way that's reliable. The proposal to add --as-needed during configure was made to fix a different problem, namely making the wrong decision about whether libintl needs to be pulled in explicitly. We don't seem to have done anything about that --- personally I was afraid that it could break as many cases as it fixed. Reviewing that thread, I still think that swapping the order of the probes for libintl and libgssapi is the safest fix for the previous issue. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
There is no good way to make the poor soul who has no standby server happy here. You're just choosing between bad alternatives. The first block error is often just that--the first one, to be joined by others soon afterward. My experience at how drives fail says the second error is a lot more likely after you've seen one.

For what it's worth, Oracle allows you to recover a specific block from backups, including replaying the archive logs for that one block.

-- greg

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 02:41 AM, Bruce Momjian wrote: On Mon, Dec 17, 2012 at 12:14:29PM +0100, Bernhard Schrader wrote: Hello together, last Thursday I upgraded one of our 9.0.6 PostgreSQL servers to 9.2.2 with pg_upgrade. So far everything seemed to work, but we now discover problems with the enum types. If we run one specific query it breaks every time with such an error message: ERROR: invalid internal value for enum: 520251 If this number is supposed to represent the enumtypid, it does not exist anymore in pg_enum. How could I solve this problem? Should we regenerate all enums? Or what could we do? Hopefully anyone has a clue; Google doesn't seem to be the resource for this problem.

We seriously tested the enum code so I am pretty confused why this is failing. If you do pg_dump --binary-upgrade --schema-only, do you see a number like this being defined just before the enum is added?

Hi Bruce, if I dump this db and grep through the dump, I can't find the number. As far as we can see, the enum that is affected now has the enumtypid 16728. Is there a table which keeps the possible typecasts from enum to text / text to enum etc.? If so, maybe the mapping in there is corrupt since the upgrade. regards

###

Hi again, maybe more information is needed to pin this down. I'm not quite sure what would be useful, so I'll just give you the things I stumbled upon.

1.) We have some staging servers, where I first used pg_upgrade to make sure everything is running and nothing breaks on our beta/live servers. And it worked there: without any problem I can use the enums which break on the beta servers.

2.) As mentioned, on the beta servers the usage of the enum fails with the error message: ERROR: invalid internal value for enum: 520251

3.) If I search for the enumtypid or oid in pg_enum, it is obviously not there.
select * from pg_enum where enumtypid=520251;
(No rows)
select * from pg_enum where oid=520251;
(No rows)

4.) If I search for the enum labels which are used by the query, I get enumtypid 16728, which also belongs to the expected pg_type.

5.) pg_enum for that enumtypid looks like this:
select oid,* from pg_enum where enumtypid=16728;
  oid   | enumtypid | enumsortorder | enumlabel
--------+-----------+---------------+-----------
  16729 |     16728 |             1 | att
  16730 |     16728 |             2 | def
  16731 |     16728 |             3 | all
 646725 |     16728 |             4 | adm_att
 646726 |     16728 |             5 | adm_def

6.) The enum labels adm_att and adm_def are also defined under another enumtypid, but I think this shouldn't affect anything; I just wanted to mention it.

7.) During pg_upgrade I used the --link method.

Well, if you need any other info please ask. I just can't imagine why this stuff worked on the staging servers but not on beta, as they are identical from a database point of view.

-- Bernhard Schrader System Administration InnoGames GmbH Harburger Schloßstraße 28 (Channel 4) - 21079 Hamburg - Germany Tel +49 40 7889335-53 Fax +49 40 7889335-22 Managing Directors: Hendrik Klindworth, Eike Klindworth, Michael Zillmer VAT-ID: DE264068907 Amtsgericht Hamburg, HRB 108973 http://www.innogames.com -- bernhard.schra...@innogames.de
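A hedged diagnostic sketch (the type, table and column names below are hypothetical): the error means some enum datum carries an OID that pg_enum no longer knows, so it can help to compare what the column's type currently defines against what the stored values decode to:

    -- the labels the type currently knows about
    SELECT oid, enumtypid, enumsortorder, enumlabel
    FROM pg_enum
    WHERE enumtypid = 'public.myenum'::regtype;

    -- force every stored value through the enum output function; if a row still
    -- carries a dangling OID such as 520251, this reproduces the error and the
    -- ctid narrows down where it is stored
    SELECT ctid, status::text FROM some_table;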
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 03:45 AM, Bernhard Schrader wrote: On 12/18/2012 02:41 AM, Bruce Momjian wrote: On Mon, Dec 17, 2012 at 12:14:29PM +0100, Bernhard Schrader wrote: Hello together, last Thursday I upgraded one of our 9.0.6 PostgreSQL servers to 9.2.2 with pg_upgrade. So far everything seemed to work, but we now discover problems with the enum types. If we run one specific query it breaks every time with such an error message: ERROR: invalid internal value for enum: 520251 If this number is supposed to represent the enumtypid, it does not exist anymore in pg_enum. How could I solve this problem? Should we regenerate all enums? Or what could we do? Hopefully anyone has a clue; Google doesn't seem to be the resource for this problem.

We seriously tested the enum code so I am pretty confused why this is failing. If you do pg_dump --binary-upgrade --schema-only, do you see a number like this being defined just before the enum is added?

Hi Bruce, if I dump this db and grep through the dump, I can't find the number. As far as we can see, the enum that is affected now has the enumtypid 16728. Is there a table which keeps the possible typecasts from enum to text / text to enum etc.? If so, maybe the mapping in there is corrupt since the upgrade.

The translations from oid to label are in pg_enum, but it looks like somehow you have lost that mapping. I'm not sure what you've done, but AFAICT pg_upgrade is doing the right thing. I just did this (from 9.0 to 9.2) and the pg_upgrade_dump_all.sql that is used to create the new catalog has these lines:

-- For binary upgrade, must preserve pg_type oid
SELECT binary_upgrade.set_next_pg_type_oid('40804'::pg_catalog.oid);

-- For binary upgrade, must preserve pg_type array oid
SELECT binary_upgrade.set_next_array_pg_type_oid('40803'::pg_catalog.oid);

CREATE TYPE myenum AS ENUM (
);

-- For binary upgrade, must preserve pg_enum oids
SELECT binary_upgrade.set_next_pg_enum_oid('40805'::pg_catalog.oid);
ALTER TYPE public.myenum ADD VALUE 'foo';

SELECT binary_upgrade.set_next_pg_enum_oid('40806'::pg_catalog.oid);
ALTER TYPE public.myenum ADD VALUE 'bar';

SELECT binary_upgrade.set_next_pg_enum_oid('40807'::pg_catalog.oid);
ALTER TYPE public.myenum ADD VALUE 'baz';

and this worked exactly as expected, with a table using this type showing the expected values. Can you produce a test case demonstrating the error? When you run pg_upgrade, use the -r flag to keep all the intermediate files so we can see what's going on. It's no good dumping the new db looking for these values if they have been lost. You would need to have a physical copy of the old db and dump that in binary upgrade mode looking for the Oid. If you don't have a physical copy of the old db or the intermediate dump file pg_upgrade used, then recovery is going to be pretty difficult. It's not necessarily impossible, but it might involve you getting some outside help.

cheers

andrew

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On Tue, Dec 18, 2012 at 10:52:46AM -0500, Andrew Dunstan wrote: The translations from oid to label are in pg_enum, but it looks like somehow you have lost that mapping. I'm not sure what you've done but AFAICT pg_upgrade is doing the right thing. I just did this (from 9.0 to 9.2) and the pg_upgrade_dump_all.sql that is used to create the new catalog has these lines: -- For binary upgrade, must preserve pg_type oid SELECT binary_upgrade.set_next_pg_type_oid('40804'::pg_catalog.oid); -- For binary upgrade, must preserve pg_type array oid SELECT binary_upgrade.set_next_array_pg_type_oid('40803'::pg_catalog.oid); CREATE TYPE myenum AS ENUM ( ); -- For binary upgrade, must preserve pg_enum oids SELECT binary_upgrade.set_next_pg_enum_oid('40805'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'foo'; SELECT binary_upgrade.set_next_pg_enum_oid('40806'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'bar'; SELECT binary_upgrade.set_next_pg_enum_oid('40807'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'baz'; and this worked exactly as expected, with a table using this type showing the expected values. Can you produce a test case demonstrating the error? When you run pg_upgrade, use the -r flag to keep all the intermediate files so we can see what's going on. It's no good dumping the new db looking for these values if they have been lost. You would need to have a physical copy of the old db and dump that in binary upgrade mode looking for the Oid. If you don't have a physical copy of the old db or the intermediate dump file pg_upgrade used then recovery is going to be pretty difficult. It's not necessarily impossible, but it might involve you getting some outside help. Yes, this matches what I thought too. You see the binary_upgrade.set_next_pg_enum_oid() calls in pg_dump --binary-upgrade --schema-only and those set the oid of the newly created enum. I agree you would need to run this on the _old_ cluster for us to figure out how it failed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] trouble with pg_upgrade 9.0 - 9.1
On Tue, Dec 18, 2012 at 09:28:00AM +0400, Groshev Andrey wrote: 18.12.2012, 05:22, Bruce Momjian br...@momjian.us: This is the first pg_upgrade mismatch report we have gotten about 9.1. I have asked the reporter for details. Is what is the full 9.1 version number? --- # rpm -qa |grep postgres postgresql90-devel-9.0.11-1PGDG.rhel6.x86_64 postgresql91-9.1.7-1PGDG.rhel6.x86_64 postgresql90-9.0.11-1PGDG.rhel6.x86_64 postgresql90-server-9.0.11-1PGDG.rhel6.x86_64 postgresql91-libs-9.1.7-1PGDG.rhel6.x86_64 postgresql91-server-9.1.7-1PGDG.rhel6.x86_64 postgresql91-devel-9.1.7-1PGDG.rhel6.x86_64 postgresql90-libs-9.0.11-1PGDG.rhel6.x86_64 postgresql90-contrib-9.0.11-1PGDG.rhel6.x86_64 postgresql91-contrib-9.1.7-1PGDG.rhel6.x86_64 Full version ? It is not full postgresql91-9.1.7-1PGDG.rhel6.x86_64 or I do not understand something? I installed latest postgresql from the repository http://yum.pgrpms.org Oops, I see that now, sorry. I wanted to make sure you were on the most recent 9.1 version, and you are. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.3] writable foreign tables
Hello. I've tried to implement this API for our Multicorn FDW, and I have a few questions about the API.

First, I don't understand what the requirements are on the rowid pseudo-column values. In particular, this sentence from a previous mail makes it ambiguous to me: At the beginning, I constructed the rowid pseudo-column using INTERNALOID, but it caused trouble when fetched tuples are materialized prior to table modification. So, the rowid argument was re-defined as const char *.

Does that mean that one should only store a cstring in the rowid pseudo-column? Or can one store an arbitrary pointer? Currently, I'm building a text value using cstring_to_text_with_len. Could there be a problem with that?

Secondly, how does one use a RETURNING clause? I've tried to look at the postgres_fdw code, but it does not seem to handle that. For what it's worth, knowing that postgres_fdw is still in WIP status, here are a couple of results from my experimentation with it:

- An insert into a foreign table mapped to a table with a BEFORE trigger, using a RETURNING clause, will return the values as they were before being modified by the trigger.
- Same thing, but if the trigger prevents insertion (returning NULL), the would-have-been inserted row is still returned, although the insert statement reports zero rows.
- Inserting into a table with default values does not work as intended, since missing values are replaced by a null value in the remote statement.

What can be done to make the behaviour more consistent? I'm very excited about this feature, thank you for making this possible.

Regards, -- Ronan Dunklau

2012/12/14 Albe Laurenz laurenz.a...@wien.gv.at

Kohei KaiGai wrote: I came up with one more query that causes a problem: [...] This causes a deadlock, but one that is not detected; the query just keeps hanging. The UPDATE in the CTE has the rows locked, so the SELECT ... FOR UPDATE issued via the FDW connection will hang indefinitely. I wonder if that's just a pathological corner case that we shouldn't worry about. Loopback connections for FDWs themselves might not be so rare, for example as a substitute for autonomous subtransactions. I guess it is not easily possible to detect such a situation or to do something reasonable about it.

It is an unavoidable problem due to the nature of distributed database systems, not only the loopback scenario. In my personal opinion, I'd like to wait for someone to implement a distributed lock/transaction manager on top of the background worker framework recently committed, to intermediate lock requests. However, it will take a massive amount of effort on the existing lock/transaction management layer, not only enhancement of the FDW APIs. It is obviously out of scope for this patch. So, I'd like to suggest that authors of FDWs that support writable features mention the possible deadlock scenario in their documentation. At least, the above writable CTE example is a situation where two different sessions concurrently update the test relation, so one will be blocked.

Fair enough. I tried to overhaul the documentation, see the attached patch. There was one thing that I was not certain of: You say that for writable foreign tables, BeginForeignModify and EndForeignModify *must* be implemented. I thought that these were optional, and if you can do your work with just, say, ExecForeignDelete, you could do that.

Yes, that's right. What I wrote was incorrect. If the FDW driver does not require any state during modification of foreign tables, then indeed these are not mandatory handlers.
I have updated the documentation, that was all I changed in the attached patches. OK. I split the patch into two portion, part-1 is the APIs relevant patch, part-2 is relevant to postgres_fdw patch. Great. I'll mark the patch as ready for committer. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
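To make the RETURNING inconsistency above concrete, a hedged sketch (the foreign table, columns and trigger are all hypothetical): if the remote table has a BEFORE INSERT trigger that rewrites name, plus a DEFAULT on created, then through the foreign table

    INSERT INTO my_foreign_table (id, name) VALUES (1, 'bob') RETURNING *;

was observed to return name = 'bob' (the pre-trigger value), and the omitted created column is sent as an explicit NULL in the remote statement instead of letting the remote default apply.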
Re: [HACKERS] [GENERAL] trouble with pg_upgrade 9.0 - 9.1
On Mon, Dec 17, 2012 at 09:21:59PM -0500, Bruce Momjian wrote: Mismatch of relation names: database "database", old rel public.lob.ВерсияВнешнегоДокумента$Документ_pkey, new rel public.plob.ВерсияВнешнегоДокумента$Документ Failure, exiting

I am now confused over the error message above. This is the code that is generating the error:

    /*
     * TOAST table names initially match the heap pg_class oid.
     * In pre-8.4, TOAST table names change during CLUSTER; in pre-9.0,
     * TOAST table names change during ALTER TABLE ALTER COLUMN SET TYPE.
     * In >= 9.0, TOAST relation names always use heap table oids, hence
     * we cannot check relation names when upgrading from pre-9.0.
     * Clusters upgraded to 9.0 will get matching TOAST names.
     */
    if (strcmp(old_rel->nspname, new_rel->nspname) != 0 ||
        ((GET_MAJOR_VERSION(old_cluster.major_version) >= 900 ||
          strcmp(old_rel->nspname, "pg_toast") != 0) &&
         strcmp(old_rel->relname, new_rel->relname) != 0))
        pg_log(PG_FATAL, "Mismatch of relation names: database \"%s\", "
               "old rel %s.%s, new rel %s.%s\n",
               old_db->db_name, old_rel->nspname, old_rel->relname,
               new_rel->nspname, new_rel->relname);

Looking at the Russian, I see 'old rel' public.lob.* and 'new rel' public.plob.*. I assume the database is called 'database', and the schema is called 'public', but what are 'lob' and 'plob'? If those are tables or indexes, what is after the period? Do you have periods embedded in the table/index names? That is certainly possible, but not common, e.g.:

    test=# create table "test.x" (y int);
    CREATE TABLE

Is the schema called public.lob? I expected to see schema.objname.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] system administration functions with hardcoded superuser checks
There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
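What that would look like in practice, as a sketch (the role name is made up, and under today's hardcoded checks such a GRANT has no effect, which is exactly the point of the proposal):

    REVOKE EXECUTE ON FUNCTION pg_ls_dir(text) FROM PUBLIC;
    GRANT EXECUTE ON FUNCTION pg_ls_dir(text) TO monitoring_role;
    GRANT EXECUTE ON FUNCTION pg_stat_file(text) TO monitoring_role;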
Re: [HACKERS] system administration functions with hardcoded superuser checks
On 18 December 2012 17:09, Peter Eisentraut pete...@gmx.net wrote: There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? +1 But I would include recovery functions also. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] trouble with pg_upgrade 9.0 - 9.1
Later in the pg_dump log, I found the definition of the new rel:

--
-- Name: plob.ВерсияВнешнегоДокумента$Документ; Type: CONSTRAINT; Schema: public; Owner: postgres; Tablespace:
--

ALTER TABLE ONLY "lob.ВерсияВнешнегоДокумента$Документ"
    ADD CONSTRAINT "plob.ВерсияВнешнегоДокумента$Документ" PRIMARY KEY ("@Файл", "Страница");

18.12.2012, 19:38, Bruce Momjian br...@momjian.us:

On Mon, Dec 17, 2012 at 09:21:59PM -0500, Bruce Momjian wrote: Mismatch of relation names: database "database", old rel public.lob.ВерсияВнешнегоДокумента$Документ_pkey, new rel public.plob.ВерсияВнешнегоДокумента$Документ Failure, exiting

I am now confused over the error message above. This is the code that is generating the error:

    /*
     * TOAST table names initially match the heap pg_class oid.
     * In pre-8.4, TOAST table names change during CLUSTER; in pre-9.0,
     * TOAST table names change during ALTER TABLE ALTER COLUMN SET TYPE.
     * In >= 9.0, TOAST relation names always use heap table oids, hence
     * we cannot check relation names when upgrading from pre-9.0.
     * Clusters upgraded to 9.0 will get matching TOAST names.
     */
    if (strcmp(old_rel->nspname, new_rel->nspname) != 0 ||
        ((GET_MAJOR_VERSION(old_cluster.major_version) >= 900 ||
          strcmp(old_rel->nspname, "pg_toast") != 0) &&
         strcmp(old_rel->relname, new_rel->relname) != 0))
        pg_log(PG_FATAL, "Mismatch of relation names: database \"%s\", "
               "old rel %s.%s, new rel %s.%s\n",
               old_db->db_name, old_rel->nspname, old_rel->relname,
               new_rel->nspname, new_rel->relname);

Looking at the Russian, I see 'old rel' public.lob.* and 'new rel' public.plob.*. I assume the database is called 'database', and the schema is called 'public', but what are 'lob' and 'plob'? If those are tables or indexes, what is after the period? Do you have periods embedded in the table/index names? That is certainly possible, but not common, e.g.:

    test=# create table "test.x" (y int);
    CREATE TABLE

Is the schema called public.lob? I expected to see schema.objname.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system administration functions with hardcoded superuser checks
2012/12/18 Peter Eisentraut pete...@gmx.net:

There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile

Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary?

Isn't that too strong a gun for some people? I am afraid that someone could reduce the necessary rights and open a door into the system.

pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir

These are relatively dangerous and I am not for opening these functions up. A power user can simply write an extension, but then he knows what he is doing.

Regards Pavel

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup from cascading standby after timeline switch
On Tue, Dec 18, 2012 at 8:09 PM, Simon Riggs si...@2ndquadrant.com wrote: On 18 December 2012 00:53, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com wrote: I also wonder if pg_basebackup should include *all* timeline history files in the backup, not just the latest one strictly required to restore. Why? the timeline history file includes the previous timelines already. The original intention was that the WAL archive might contain multiple timeline files corresponding to various experimental recovery attempts; furthermore, such files might be hand-annotated (that's why there's a comment provision). So they would be at least as valuable from an archival standpoint as the WAL files proper. I think we ought to just copy all of them, period. Anything less is penny-wise and pound-foolish. What I'm saying is that the new history file is created from the old one, so the latest file includes all the history as a direct copy. The only thing new is one line of information. The timeline history file includes only ancestor timelines history. So the latest one might not include all the history. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 05:22 PM, Bruce Momjian wrote: On Tue, Dec 18, 2012 at 10:52:46AM -0500, Andrew Dunstan wrote: The translations from oid to label are in pg_enum, but it looks like somehow you have lost that mapping. I'm not sure what you've done but AFAICT pg_upgrade is doing the right thing. I just did this (from 9.0 to 9.2) and the pg_upgrade_dump_all.sql that is used to create the new catalog has these lines: -- For binary upgrade, must preserve pg_type oid SELECT binary_upgrade.set_next_pg_type_oid('40804'::pg_catalog.oid); -- For binary upgrade, must preserve pg_type array oid SELECT binary_upgrade.set_next_array_pg_type_oid('40803'::pg_catalog.oid); CREATE TYPE myenum AS ENUM ( ); -- For binary upgrade, must preserve pg_enum oids SELECT binary_upgrade.set_next_pg_enum_oid('40805'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'foo'; SELECT binary_upgrade.set_next_pg_enum_oid('40806'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'bar'; SELECT binary_upgrade.set_next_pg_enum_oid('40807'::pg_catalog.oid); ALTER TYPE public.myenum ADD VALUE 'baz'; and this worked exactly as expected, with a table using this type showing the expected values. Can you produce a test case demonstrating the error? When you run pg_upgrade, use the -r flag to keep all the intermediate files so we can see what's going on. It's no good dumping the new db looking for these values if they have been lost. You would need to have a physical copy of the old db and dump that in binary upgrade mode looking for the Oid. If you don't have a physical copy of the old db or the intermediate dump file pg_upgrade used then recovery is going to be pretty difficult. It's not necessarily impossible, but it might involve you getting some outside help. Yes, this matches what I thought too. You see the binary_upgrade.set_next_pg_enum_oid() calls in pg_dump --binary-upgrade --schema-only and those set the oid of the newly created enum. I agree you would need to run this on the _old_ cluster for us to figure out how it failed. Hey, i just made a testrun, i restored a dump to a testmachine with 9.0 running, made a pg_dump --binary-upgrade --schema-only of that, made my upgrade to 9.2, after that i checked the schema dump and the values of the enumtypid in the 9.2 database and they were identically. Thats how it is expected to be. Nevertheless this didn't worked with the beta server. but i have no dump to prove this. Beside the fact that i want to fix my db's, i would also like to help to improve the upgrade process, but i have no clue right now how i could do this. i think i will try some other dbs to check if there maybe an error occurs. Beside of that, we tested a little bit more with the failing query: The statement which is causing the error is a big UPDATE-statement with FROM. After some testing we figured out that the subselect in the FROM-clause is working fine. And if we simplify the UPDATE-statement it's also working. We're able to show the data and we're able to do simple updates on the table. But the two things combined are not working. We checked the data from the subselect - it's correct. In the FROM-clause we're using a window-function to calculate a ranking. Do you know, if there is any mapping for window-functions which has to deal with enums? 
regards -- Bernhard Schrader System Administration InnoGames GmbH Harburger Schloßstraße 28 (Channel 4) - 21079 Hamburg - Germany Tel +49 40 7889335-53 Fax +49 40 7889335-22 Managing Directors: Hendrik Klindworth, Eike Klindworth, Michael Zillmer VAT-ID: DE264068907 Amtsgericht Hamburg, HRB 108973 http://www.innogames.com – bernhard.schra...@innogames.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Error restoring from a base backup taken from standby
On Tue, Dec 18, 2012 at 2:39 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: (This is different from the other issue related to timeline switches I just posted about. There's no timeline switch involved in this one.) If you do pg_basebackup -x against a standby server, in some circumstances the backup fails to restore with error like this: C 2012-12-17 19:09:44.042 EET 7832 LOG: database system was not properly shut down; automatic recovery in progress C 2012-12-17 19:09:44.091 EET 7832 LOG: record with zero length at 0/1764F48 C 2012-12-17 19:09:44.091 EET 7832 LOG: redo is not required C 2012-12-17 19:09:44.091 EET 7832 FATAL: WAL ends before end of online backup C 2012-12-17 19:09:44.091 EET 7832 HINT: All WAL generated while online backup was taken must be available at recovery. C 2012-12-17 19:09:44.092 EET 7831 LOG: startup process (PID 7832) exited with exit code 1 C 2012-12-17 19:09:44.092 EET 7831 LOG: aborting startup due to startup process failure I spotted this bug while reading the code, and it took me quite a while to actually construct a test case to reproduce the bug, so let me begin by discussing the code where the bug is. You get the above error, WAL ends before end of online backup, when you reach the end of WAL before reaching the backupEndPoint stored in the control file, which originally comes from the backup_label file. backupEndPoint is only used in a base backup taken from a standby, in a base backup taken from the master, the end-of-backup WAL record is used instead to mark the end of backup. In the xlog redo loop, after replaying each record, we check if we've just reached backupEndPoint, and clear it from the control file if we have. Now the problem is, if there are no WAL records after the checkpoint redo point, we never even enter the redo loop, so backupEndPoint is not cleared even though it's reached immediately after reading the initial checkpoint record. Good catch! To deal with the similar situation wrt. reaching consistency for hot standby purposes, we call CheckRecoveryConsistency() before the redo loop. The straightforward fix is to copy-paste the check for backupEndPoint to just before the redo loop, next to the CheckRecoveryConsistency() call. Even better, I think we should move the backupEndPoint check into CheckRecoveryConsistency(). It's already responsible for keeping track of whether minRecoveryPoint has been reached, so it seems like a good idea to do this check there as well. Attached is a patch for that (for 9.2), as well as a script I used to reproduce the bug. The patch looks good to me. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup from cascading standby after timeline switch
Fujii Masao masao.fu...@gmail.com writes: On Tue, Dec 18, 2012 at 8:09 PM, Simon Riggs si...@2ndquadrant.com wrote: What I'm saying is that the new history file is created from the old one, so the latest file includes all the history as a direct copy. The only thing new is one line of information. The timeline history file includes only ancestor timelines history. So the latest one might not include all the history. Indeed. And even if there are a thousand of them, so what? It'd still be less space than a single WAL segment file. Better to keep the data than wish we had it later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
Bernhard Schrader bernhard.schra...@innogames.de writes: Beside of that, we tested a little bit more with the failing query: The statement which is causing the error is a big UPDATE-statement with FROM. After some testing we figured out that the subselect in the FROM-clause is working fine. And if we simplify the UPDATE-statement it's also working. We're able to show the data and we're able to do simple updates on the table. But the two things combined are not working. Does the table being updated have any indexes on enum columns? I'm suspicious that the bogus OID is in an index page somewhere, and not in the table at all. If that is the answer, then reindexing said index would get rid of the problem (as well as all evidence that would help us find out how it happened ...) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
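For anyone wanting to check this, a rough catalog query (a sketch only; the schema filter is an assumption) that lists indexes containing an enum-typed column:

SELECT n.nspname, c.relname AS index_name, a.attname, t.typname
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = i.indexrelid
JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typtype = 'e'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');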
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 01:24 PM, Tom Lane wrote: Bernhard Schrader bernhard.schra...@innogames.de writes: Beside of that, we tested a little bit more with the failing query: The statement which is causing the error is a big UPDATE-statement with FROM. After some testing we figured out that the subselect in the FROM-clause is working fine. And if we simplify the UPDATE-statement it's also working. We're able to show the data and we're able to do simple updates on the table. But the two things combined are not working. Does the table being updated have any indexes on enum columns? I'm suspicious that the bogus OID is in an index page somewhere, and not in the table at all. If that is the answer, then reindexing said index would get rid of the problem (as well as all evidence that would help us find out how it happened ...) Unless they can make a physical backup of the datadir first. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 2012-12-18 13:24:12 -0500, Tom Lane wrote: Bernhard Schrader bernhard.schra...@innogames.de writes: Beside of that, we tested a little bit more with the failing query: The statement which is causing the error is a big UPDATE-statement with FROM. After some testing we figured out that the subselect in the FROM-clause is working fine. And if we simplify the UPDATE-statement it's also working. We're able to show the data and we're able to do simple updates on the table. But the two things combined are not working. Does the table being updated have any indexes on enum columns? I'm suspicious that the bogus OID is in an index page somewhere, and not in the table at all. I already wondered whether it could be a problem caused by pg_upgrade not ignoring invalid indexes until recently, but I don't really see how it could cause an invalid oid to end up in the index. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
Andres Freund and...@2ndquadrant.com writes: On 2012-12-18 13:24:12 -0500, Tom Lane wrote: Does the table being updated have any indexes on enum columns? I'm suspicious that the bogus OID is in an index page somewhere, and not in the table at all. I already wondered whether it could be a problem caused by pg_upgrade not ignoring invalid indexes until recently, but I don't really see how it could cause an invalid oid to end up in the index. It seems like this might indicate a flaw in our scheme for preventing uncommitted enum values from getting into tables/indexes. Hard to see what though. Bernhard, if you do identify a particular index as being the source of the failure, that would at least tell us for sure which enum type is at fault. I don't suppose you would have any info about the history of that enum type in your database? The fact that the OID is odd implies that it belonged to a value that was added by ALTER TYPE ADD VALUE, but what we'd want is some context around any past uses of that command, especially if they failed or were rolled back. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
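A hedged sketch of how that history could be gathered from the catalog; 'public.myenum' stands in for the real type name and must be replaced:

SELECT oid, enumsortorder, enumlabel
FROM pg_enum
WHERE enumtypid = 'public.myenum'::regtype
ORDER BY enumsortorder;
-- On 9.1 and later, odd OIDs normally mean the value was added afterwards with
-- ALTER TYPE ... ADD VALUE; gaps or noncontiguous OIDs suggest the type has
-- been modified since it was created.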
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 02:34 PM, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-12-18 13:24:12 -0500, Tom Lane wrote: Does the table being updated have any indexes on enum columns? I'm suspicious that the bogus OID is in an index page somewhere, and not in the table at all. I already wondered whether it could be a problem caused by pg_upgrade not ignoring invalid indexes until recently, but I don't really see how it could cause an invalid oid to end up in the index. It seems like this might indicate a flaw in our scheme for preventing uncommitted enum values from getting into tables/indexes. Hard to see what though. Bernhard, if you do identify a particular index as being the source of the failure, that would at least tell us for sure which enum type is at fault. I don't suppose you would have any info about the history of that enum type in your database? The fact that the OID is odd implies that it belonged to a value that was added by ALTER TYPE ADD VALUE, but what we'd want is some context around any past uses of that command, especially if they failed or were rolled back. He's upgrading from 9.0, which didn't have enum extension at all, and where odd enums didn't mean anything special. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
Andrew Dunstan and...@dunslane.net writes: He's upgrading from 9.0, which didn't have enum extension at all, and where odd enums didn't mean anything special. Really? The noncontiguous pg_enum OIDs shown in http://archives.postgresql.org/pgsql-hackers/2012-12/msg01089.php suggest strongly that *something's* been done to that type since it was created. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 02:58 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: He's upgrading from 9.0, which didn't have enum extension at all, and where odd enums didn't mean anything special. Really? The noncontiguous pg_enum OIDs shown in http://archives.postgresql.org/pgsql-hackers/2012-12/msg01089.php suggest strongly that *something's* been done to that type since it was created. That's what he said. People have been known to hack pg_enum on their own, especially before we added enum extension. Of course, if they do that they get to keep both pieces. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding - GetOldestXmin
On Fri, Dec 14, 2012 at 7:19 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-12-14 14:01:30 -0500, Robert Haas wrote: On Fri, Dec 14, 2012 at 6:46 AM, Andres Freund and...@2ndquadrant.com wrote: Just moving that tidbit inside the lock seems to be the pragmatic choice. GetOldestXmin is called * once per checkpoint * one per index build * once in analyze * twice per vacuum * once for HS feedback messages Nothing of that occurs frequently enough that 5 instructions will make a difference. I would be happy to go an alternative path, but right now I don't see any nice one. A already_locked parameter to GetOldestXmin seems to be a cure worse than the disease. I'm not sure that would be so bad, but I guess I question the need to do it this way at all. Most of the time, if you need to advertise your global xmin, you use GetSnapshotData(), not GetOldestXmin(), and I guess I'm not seeing why that wouldn't also work here. Am I dumb? I wondered upthread whether that would be better: On 2012-12-13 21:03:44 +0100, Andres Freund wrote: Another alternative to this would be to get a snapshot with GetSnapshotData(), copy the xmin to the logical slot, then call ProcArrayEndTransaction(). But that doesn't really seem to be nicer to me. Not sure why I considered it ugly anymore, but it actually has a noticeable disadvantage. GetOldestXmin is nicer is than GetSnapshotData as the latter set a fairly new xid as xmin whereas GetOldestXmin returns the actual current xmin horizon. Thats preferrable because it allows us to start up more quickly. snapbuild.c can only start building a snapshot once it has seen a xl_running_xact with oldestRunningXid = own_xmin. Otherwise we cannot be sure that no relevant catalog tuples have been removed. I'm a bit confused. Are you talking about the difference between RecentGlobalXmin and RecentXmin? I think GetSnapshotData() updates both. Anyway, if there's no nicer way, I think it's probably OK to add a parameter to GetOldestXmin(). It seems like kind of a hack, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
Andrew Dunstan and...@dunslane.net writes: People have been known to hack pg_enum on their own, especially before we added enum extension. Of course, if they do that they get to keep both pieces. Yeah ... this would be very readily explainable if there had been a manual deletion from pg_enum somewhere along the line. Even if there were at that time no instances of the OID left in tables, there could be some in upper btree pages. They'd have caused no trouble in 9.0 but would (if odd) cause trouble in 9.2. Of course, this theory doesn't explain why the problem was seen on some copies and not others cloned from the same database --- unless maybe there had been an index page split on the master in between the clonings, and that moved the troublesome OID into a position where it was more likely to get compared-to. That's not a hugely convincing explanation though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
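Putting Tom's and Andrew's advice together as a concrete, though hypothetical, sequence (the object names are placeholders):

-- 1. Preserve the evidence: stop the server and take a filesystem-level copy
--    of the data directory (or at least of the suspect index's files).
-- 2. Then rebuild the suspect index, or every index on the affected table:
REINDEX INDEX some_table_enumcol_idx;
REINDEX TABLE some_table;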
Re: [HACKERS] Enabling Checksums
On Tue, 2012-12-18 at 08:17 +, Simon Riggs wrote: I think we should discuss whether we accept my premise? Checksums will actually detect more errors than we see now, and people will want to do something about that. Returning to backup is one way of handling it, but on a busy production system with pressure on, there is incentive to implement a workaround, not a fix. It's not an easy call to say we've got 3 corrupt blocks, so I'm going to take the whole system offline while I restore from backup. Up until now, my assumption has generally been that, upon finding the corruption, the primary course of action is taking that server down (hopefully you have a good replica), and do some kind of restore or sync a new replica. It sounds like you are exploring other possibilities. I suppose we could have a new ReadBufferMaybe function that would only be used by a sequential scan; and then just skip over the page if it's corrupt, depending on a GUC. That would at least allow sequential scans to (partially) work, which might be good enough for some data recovery situations. If a catalog index is corrupted, that could just be rebuilt. Haven't thought about the details, though. Not sure if you're being facetious here or not. No. It was an incomplete thought (as I said), but sincere. Mild reworking of the logic for heap page access could cope with a NULL buffer response and subsequent looping, which would allow us to run pg_dump against a damaged table to allow data to be saved, keeping file intact for further analysis. Right. I'm suggesting we work a little harder than your block is corrupt and give some thought to what the user will do next. Indexes are a good case, because we can/should report the block error, mark the index as invalid and then hint that it should be rebuilt. Agreed; this applies to any derived data. I don't think it will be very practical to keep a server running in this state forever, but it might give enough time to reach a suitable maintenance window. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
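Simon's idea of marking a damaged index invalid would dovetail with the existing pg_index.indisvalid flag, which is already how a failed CREATE INDEX CONCURRENTLY leaves an index behind. A hedged query a DBA could use to spot indexes flagged that way:

SELECT n.nspname, c.relname AS invalid_index
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;
-- each hit can then be rebuilt with REINDEX, or dropped and recreated with
-- CREATE INDEX CONCURRENTLY on a live system.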
Re: [HACKERS] Enabling Checksums
On Tue, 2012-12-18 at 04:06 -0500, Greg Smith wrote: Having some way to nail down if the same block is bad on a given standby seems like a useful interface we should offer, and it shouldn't take too much work. Ideally you won't find the same corruption there. I'd like a way to check the entirety of a standby for checksum issues, ideally run right after it becomes current. It seems the most likely way to see corruption on one of those is to replicate a corrupt block. Part of the design is that pg_basebackup would verify checksums during replication, so we should not replicate corrupt blocks (of course, that's not implemented yet, so it's still a concern for now). And we can also have ways to do background/offline checksum verification with a separate utility. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] trouble with pg_upgrade 9.0 - 9.1
On Tue, Dec 18, 2012 at 09:34:53PM +0400, Groshev Andrey wrote: later in the log pg_dump, I found the definition of new rel -- -- Name: plob.ВерсияВнешнегоДокумента$Документ; Type: CONSTRAINT; Schema: public; Owner: postgres; Tablespace: -- ALTER TABLE ONLY lob.ВерсияВнешнегоДокумента$Документ ADD CONSTRAINT plob.ВерсияВнешнегоДокумента$Документ PRIMARY KEY (@Файл, Страница); Can you post the full definition of the table on this public email list? Also, why did the error think this was in the public schema? Any idea? --- 18.12.2012, 19:38, Bruce Momjian br...@momjian.us: On Mon, Dec 17, 2012 at 09:21:59PM -0500, Bruce Momjian wrote: Mismatch of relation names: database database, old rel public.lob.ВерсияВнешнегоДокумента$Документ_pkey, new rel public.plob.ВерсияВнешнегоДокумента$Документ Failure, exiting I am now confused over the error message above. This is the code that is generating the error: /* * TOAST table names initially match the heap pg_class oid. * In pre-8.4, TOAST table names change during CLUSTER; in pre-9.0, * TOAST table names change during ALTER TABLE ALTER COLUMN SET TYPE. * In = 9.0, TOAST relation names always use heap table oids, hence * we cannot check relation names when upgrading from pre-9.0. * Clusters upgraded to 9.0 will get matching TOAST names. */ if (strcmp(old_rel-nspname, new_rel-nspname) != 0 || ((GET_MAJOR_VERSION(old_cluster.major_version) = 900 || strcmp(old_rel-nspname, pg_toast) != 0) strcmp(old_rel-relname, new_rel-relname) != 0)) pg_log(PG_FATAL, Mismatch of relation names: database \%s\, old rel %s.%s, new rel %s.%s\n, old_db-db_name, old_rel-nspname, old_rel-relname, new_rel-nspname, new_rel-relname); Looking at the Russian, I see 'old rel' public.lob.* and 'new rel' public.plob.*. I assume the database is called 'database', and the schema is called 'public', but what is 'lob' and 'plob'? If those are tables or indexes, what is after the period? Do you have periods embedded in the table/index names? That is certainly possible, but not common, e.g.: test= create table test.x (y int); CREATE TABLE Is the schema called public.lob? I expected to see schema.objname. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
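To make the question concrete: a period in a name can either separate schema from relation or be part of a single quoted identifier, and pg_upgrade's name matching has to tell the two apart. A small illustrative example (all names invented):

CREATE SCHEMA test;
CREATE TABLE test.x (y int);       -- table "x" in schema "test"
CREATE TABLE "test.x" (y int);     -- a table literally named "test.x" in the current schema
ALTER TABLE "test.x"
    ADD CONSTRAINT "ptest.x" PRIMARY KEY (y);   -- a constraint name containing a period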
Re: [HACKERS] [ADMIN] Problems with enums after pg_upgrade
On 12/18/2012 09:38 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: People have been known to hack pg_enum on their own, especially before we added enum extension. Of course, if they do that they get to keep both pieces. Yeah ... this would be very readily explainable if there had been a manual deletion from pg_enum somewhere along the line. Even if there were at that time no instances of the OID left in tables, there could be some in upper btree pages. They'd have caused no trouble in 9.0 but would (if odd) cause trouble in 9.2. Of course, this theory doesn't explain why the problem was seen on some copies and not others cloned from the same database --- unless maybe there had been an index page split on the master in between the clonings, and that moved the troublesome OID into a position where it was more likely to get compared-to. That's not a hugely convincing explanation though. regards, tom lane Guys, thank you all. :) Reindexing helped - I ran REINDEX on two tables, and everything is now working as expected. I will provide all the information tomorrow that could help to understand this in detail, but it's getting late in Germany now :), and I've got a headache from all this stuff ^^ Thanks so much!!! -- Bernhard Schrader System Administration InnoGames GmbH Harburger Schloßstraße 28 (Channel 4) - 21079 Hamburg - Germany Tel +49 40 7889335-53 Fax +49 40 7889335-22 Managing Directors: Hendrik Klindworth, Eike Klindworth, Michael Zillmer VAT-ID: DE264068907 Amtsgericht Hamburg, HRB 108973 http://www.innogames.com – bernhard.schra...@innogames.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.3] writable foreign tables
Hi, 2012/12/18 Ronan Dunklau rdunk...@gmail.com: Hello. I've tried to implement this API for our Multicorn FDW, and I have a few questions about the API. First, I don't understand what the requirements on the rowid pseudo-column values are. In particular, this sentence from a previous mail makes it ambiguous to me: At the beginning, I constructed the rowid pseudo-column using INTERNALOID, but it caused trouble when fetched tuples are materialized prior to table modification. So, the rowid argument was re-defined as const char *. Does that mean that one should only store a cstring in the rowid pseudo-column? Or can one store an arbitrary pointer? Currently, I'm building a text value using cstring_to_text_with_len. Could there be a problem with that? All we require of the rowid pseudo-column is that it be able to identify a particular row on the remote side. In the case of postgres_fdw, the ctid of the relevant table is sufficient for that purpose. I don't recommend storing an arbitrary pointer in the rowid field, because the scanned tuple may be materialized between scanning and modifying the foreign table; the pointer would therefore have to be treated under the assumption that it is a cstring. Secondly, how does one use a returning clause? I've tried to look at the postgres_fdw code, but it does not seem to handle that. For what it's worth, knowing that postgres_fdw is still in WIP status, here are a couple of results from my experimentation with it: - Insert into a foreign table mapped to a table with a before trigger, using a returning clause, will return the values as they were before being modified by the trigger. - Same thing, but if the trigger prevents insertion (returning NULL), the would-have-been inserted row is still returned, although the insert statement reports zero rows. Hmm. You want to see the real final contents of the rows being inserted, don't you? Yep, the proposed interface has no capability to modify the supplied tuple in ExecForeignInsert or the other methods. Probably the interface needs to be adjusted to allow FDW drivers to return a modified HeapTuple or TupleTableSlot for the RETURNING calculation. (As triggers do, it could return the given one as-is if nothing changed.) Let me investigate the code around this topic. - Inserting into a table with default values does not work as intended, since missing values are replaced by a null value in the remote statement. It might be a bug in my proof-of-concept portion of postgres_fdw. The prepared INSERT statement should list only the columns actually being used, not all of them unconditionally. Thanks, What can be done to make the behaviour more consistent? I'm very excited about this feature, thank you for making this possible. Regards, -- Ronan Dunklau 2012/12/14 Albe Laurenz laurenz.a...@wien.gv.at Kohei KaiGai wrote: I came up with one more query that causes a problem: [...] This causes a deadlock, but one that is not detected; the query just keeps hanging. The UPDATE in the CTE has the rows locked, so the SELECT ... FOR UPDATE issued via the FDW connection will hang indefinitely. I wonder if that's just a pathological corner case that we shouldn't worry about. Loopback connections for FDWs themselves might not be so rare, for example as a substitute for autonomous subtransactions. I guess it is not easily possible to detect such a situation or to do something reasonable about it. It is an unavoidable problem, due to the nature of distributed database systems, not only the loopback scenario.
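For readers who have not seen the earlier mail, the undetected deadlock Laurenz describes has roughly the following shape. This is only a sketch: test, ft_test and the column names are made up, and ft_test is assumed to be a writable foreign table that loops back to the local table test.

WITH updated AS (
    UPDATE test SET val = val + 1 RETURNING id
)
UPDATE ft_test SET val = 0
FROM updated
WHERE ft_test.id = updated.id;
-- The CTE's UPDATE holds row locks on test; the outer UPDATE makes the FDW
-- issue SELECT ... FOR UPDATE on the same rows over the loopback connection,
-- which then waits forever, invisible to the deadlock detector.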
In my personal opinion, I'd like to wait until someone implements a distributed lock/transaction manager on top of the recently committed background worker framework to mediate lock requests. However, that would take a massive amount of effort in the existing lock/transaction management layer, not just enhancements to the FDW APIs. It is obviously out of scope for this patch. So I'd like to suggest that authors of FDWs that support writable features mention the possible deadlock scenario in their documentation. At least, the writable CTE example above is a situation in which two different sessions concurrently update the test relation, so one of them will be blocked. Fair enough. I tried to overhaul the documentation, see the attached patch. There was one thing that I was not certain of: You say that for writable foreign tables, BeginForeignModify and EndForeignModify *must* be implemented. I thought that these were optional, and if you can do your work with just, say, ExecForeignDelete, you could do that. Yes, that's right. What I wrote was incorrect. If the FDW driver does not require any state during modification of foreign tables, these handlers are indeed not mandatory. I have updated the documentation, that was all I changed in the attached patches. OK. I split the patch into two portions; part-1 is the API-relevant patch, part-2
Re: [HACKERS] Parser Cruft in gram.y
On Tue, Dec 18, 2012 at 4:33 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Robert Haas robertmh...@gmail.com writes: And on the other hand, if you could get a clean split between the two grammars, then regardless of exactly what the split was, it might seem a win. But it seemed to me when I looked at this that you'd have to duplicate a lot of stuff and the small parser still wouldn't end up being very small, which I found hard to get excited about. I think the goal is not so much about getting a much smaller parser, but more about have a separate parser that you don't care about the bloat of, so that you can improve DDL without fearing about main parser performance regressions. Well that would be nice, but the problem is that I see no way to implement it. If, with a unified parser, the parser is 14% of our source code, then splitting it in two will probably crank that number up well over 20%, because there will be duplication between the two. That seems double-plus un-good. I can't help but suspect that the way we handle keywords today is monumentally inefficient. The unreserved_keyword products, et al, just seem somehow badly wrong-headed. We take the trouble to distinguish all of those cases so that we an turn around and not distinguish them. I feel like there ought to be some way to use lexer states to handle this - if we're in a context where an unreserved keyword will be treated as an IDENT, then have the lexer return IDENT when it sees an unreserved keyword. I might be wrong, but it seems like that would eliminate a whole lot of parser state transitions. However, even if I'm right, I have no idea how to implement it. It just seems very wasteful that we have so many parser states that have no purpose other than (effectively) to convert an unreserved_keyword into an IDENT when the lexer could do the same thing much more cheaply given a bit more context. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser Cruft in gram.y
On 12/18/12 5:10 PM, Robert Haas wrote: I can't help but suspect that the way we handle keywords today is monumentally inefficient. The unreserved_keyword products, et al, just seem somehow badly wrong-headed. We take the trouble to distinguish all of those cases so that we an turn around and not distinguish them. I feel like there ought to be some way to use lexer states to handle this - if we're in a context where an unreserved keyword will be treated as an IDENT, then have the lexer return IDENT when it sees an unreserved keyword. The problem would be the lookahead. You need to know the next token before you can decide what context the current one is in. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding - GetOldestXmin
Hi, Robert Haas robertmh...@gmail.com schrieb: On Fri, Dec 14, 2012 at 7:19 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-12-14 14:01:30 -0500, Robert Haas wrote: On Fri, Dec 14, 2012 at 6:46 AM, Andres Freund and...@2ndquadrant.com wrote: Just moving that tidbit inside the lock seems to be the pragmatic choice. GetOldestXmin is called * once per checkpoint * one per index build * once in analyze * twice per vacuum * once for HS feedback messages Nothing of that occurs frequently enough that 5 instructions will make a difference. I would be happy to go an alternative path, but right now I don't see any nice one. A already_locked parameter to GetOldestXmin seems to be a cure worse than the disease. I'm not sure that would be so bad, but I guess I question the need to do it this way at all. Most of the time, if you need to advertise your global xmin, you use GetSnapshotData(), not GetOldestXmin(), and I guess I'm not seeing why that wouldn't also work here. Am I dumb? I wondered upthread whether that would be better: On 2012-12-13 21:03:44 +0100, Andres Freund wrote: Another alternative to this would be to get a snapshot with GetSnapshotData(), copy the xmin to the logical slot, then call ProcArrayEndTransaction(). But that doesn't really seem to be nicer to me. Not sure why I considered it ugly anymore, but it actually has a noticeable disadvantage. GetOldestXmin is nicer is than GetSnapshotData as the latter set a fairly new xid as xmin whereas GetOldestXmin returns the actual current xmin horizon. Thats preferrable because it allows us to start up more quickly. snapbuild.c can only start building a snapshot once it has seen a xl_running_xact with oldestRunningXid = own_xmin. Otherwise we cannot be sure that no relevant catalog tuples have been removed. I'm a bit confused. Are you talking about the difference between RecentGlobalXmin and RecentXmin? I think GetSnapshotData() updates both. The problem is that at the time GetSnapshotData returns the xmin horizon might have gone upwards and tuples required for decoding might get removed by other backends. That needs to be prevented while holding the procarray lock exclusively. Does it make more sense now? Andres --- Please excuse the brevity and formatting - I am writing this on my mobile phone. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parser Cruft in gram.y
On Tue, Dec 18, 2012 at 5:24 PM, Peter Eisentraut pete...@gmx.net wrote: On 12/18/12 5:10 PM, Robert Haas wrote: I can't help but suspect that the way we handle keywords today is monumentally inefficient. The unreserved_keyword products, et al, just seem somehow badly wrong-headed. We take the trouble to distinguish all of those cases so that we an turn around and not distinguish them. I feel like there ought to be some way to use lexer states to handle this - if we're in a context where an unreserved keyword will be treated as an IDENT, then have the lexer return IDENT when it sees an unreserved keyword. The problem would be the lookahead. You need to know the next token before you can decide what context the current one is in. Yeah, that's why I don't know how to make it work. It feels like this is partly artifact of the tool, though. I mean, suppose we haven't read anything yet. Then, the next token can't be an IDENT, so if we see an unreserved keyword, we know we're not going to convert it to an IDENT. OTOH, if we've seen CREATE TABLE, the next token cannot be an unreserved keyword that is intended as a keyword; it has to be something that will reduce to ColId. I guess the problem situation is where we can shift the keyword and then use the following token to decide whether to reduce it to ColId/type_function_name/ColLabel or use some other rule instead; the CREATE INDEX CONCURRENTLY productions might be such a case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system administration functions with hardcoded superuser checks
On Tue, Dec 18, 2012 at 12:09:10PM -0500, Peter Eisentraut wrote: There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? +1. You can already use a SECURITY DEFINER wrapper, so I don't think this opens any particular floodgate. GRANT is a nicer interface. However, I would not advertise this as a replacement for wrapper functions until pg_dump can preserve ACL changes to pg_catalog objects. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding - GetOldestXmin
On Tue, Dec 18, 2012 at 5:25 PM, anara...@anarazel.de and...@anarazel.de wrote: The problem is that at the time GetSnapshotData returns the xmin horizon might have gone upwards and tuples required for decoding might get removed by other backends. That needs to be prevented while holding the procarray lock exclusively. Well, for the ordinary use of GetSnapshotData(), that doesn't matter, because GetSnapshotData() also updates proc->xmin. If you're trying to store a different value in that field then of course it matters. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system administration functions with hardcoded superuser checks
On Tue, Dec 18, 2012 at 7:41 PM, Noah Misch n...@leadboat.com wrote: On Tue, Dec 18, 2012 at 12:09:10PM -0500, Peter Eisentraut wrote: There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? +1. You can already use a SECURITY DEFINER wrapper, so I don't think this opens any particular floodgate. GRANT is a nicer interface. However, I would not advertise this as a replacement for wrapper functions until pg_dump can preserve ACL changes to pg_catalog objects. Yeah. That is a bit of a foot-gun to this approach, although I too agree on the general theory. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] system administration functions with hardcoded superuser checks
On 18.12.2012 18:38, Pavel Stehule wrote: 2012/12/18 Peter Eisentraut pete...@gmx.net: There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? isn't it too strong gun for some people ??? I believe so some one can decrease necessary rights and it opens doors to system. No one was speaking about making them executable by a wider group of users by default (i.e. decreasing necessary rights). Today, when you need to provide the EXECUTE privilege on those functions, you have three options (a) make him a superuser - obviously not a good choice (b) create a SECURITY DEFINER wrapper **for each function separately** (c) deny to do that Being able to do a plain GRANT on the function is merely a simpler way to do (b). It has advantages (less objects/functions to care about) and disadvantages (e.g. you can't do additional parameter values checks). pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir is relative dangerous and I am not for opening these functions. power user can simply to write extension, but he knows what he does/ I see only dangers that are already present. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
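To spell out option (b) versus the proposal, here is a hedged example; the monitoring schema and the monitor role are assumptions of this sketch, not anything that exists by default:

-- (b) today: one SECURITY DEFINER wrapper per function, owned by a superuser
CREATE FUNCTION monitoring.reload_conf() RETURNS boolean
    LANGUAGE sql SECURITY DEFINER
    AS 'SELECT pg_catalog.pg_reload_conf()';
REVOKE ALL ON FUNCTION monitoring.reload_conf() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION monitoring.reload_conf() TO monitor;

-- what the proposal would allow once the hardcoded superuser check is gone:
GRANT EXECUTE ON FUNCTION pg_catalog.pg_reload_conf() TO monitor;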
Re: [HACKERS] logical decoding - GetOldestXmin
On 2012-12-18 19:56:18 -0500, Robert Haas wrote: On Tue, Dec 18, 2012 at 5:25 PM, anara...@anarazel.de and...@anarazel.de wrote: The problem is that at the time GetSnapshotData returns the xmin horizon might have gone upwards and tuples required for decoding might get removed by other backends. That needs to be prevented while holding the procarray lock exclusively. Well, for the ordinary use of GetSnapshotData(), that doesn't matter, because GetSnapshotData() also updates proc->xmin. If you're trying to store a different value in that field then of course it matters. Absolutely right. I don't want to say there's anything wrong with it right now. The problem for me is that it sets proc->xmin to the newest value it can, while I want/need the oldest valid value... I will go with adding an already_locked parameter to GetOldestXmin then. Thanks for the input, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles
[moved to hackers] On Wednesday, December 5, 2012, Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com javascript:; writes: I now see where the cost is coming from. In commit 21a39de5809 (first appearing in 9.2) the fudge factor cost estimate for large indexes was increased by about 10 fold, which really hits this index hard. This was fixed in commit bf01e34b556 Tweak genericcostestimate's fudge factor for index size, by changing it to use the log of the index size. But that commit probably won't be shipped until 9.3. Hm. To tell you the truth, in October I'd completely forgotten about the January patch, and was thinking that the 1/1 cost had a lot of history behind it. But if we never shipped it before 9.2 then of course that idea is false. Perhaps we should backpatch the log curve into 9.2 --- that would reduce the amount of differential between what 9.2 does and what previous branches do for large indexes. I think we should backpatch it for 9.2.3. I've seen another email which is probably due to the same issue (nested loop vs hash join). And some monitoring of a database I am responsible for suggests it might be heading in that direction as well as the size grows. But I am wondering if it should be present at all in 9.3. When it was introduced, the argument seemed to be that smaller indexes might be easier to keep in cache. And surely that is so. But if a larger index that covers the same type of queries exists when a smaller one also exists, we can assume the larger one also exists for a reason. While it may be easier to keep a smaller index in cache, it is not easier to keep both a larger and a smaller one in cache as the same time. So it seems to me that this reasoning is a wash. (Countering this argument is that a partial index is more esoteric, and so if one exists it is more likely to have been well-thought out) The argument for increasing the penalty by a factor of 10 was that the smaller one could be swamped by noise such as page-boundary-roundoff behavior. I don't really know what that means, but it seems to me that if it is so easily swamped by noise, then it probably isn't so important in the first place which one it chooses. Whereas, I think that even the log based penalty has the risk of being too much on large indexes. (For one thing, it implicitly assumes the fan-out ratio at each level of btree is e, when it will usually be much larger than e.) One thing which depends on the index size which, as far as I can tell, is not currently being counted is the cost of comparing the tuples all the way down the index. This would be proportional to log2(indextuples) * cpu_index_tuple_cost, or maybe log2(indextuples) * (cpu_index_tuple_cost+cpu_operator_cost), or something like that. This cost would depend on the number index tuples, not baserel tuples, and so would penalize large indexes. It would be much smaller than the current log(pages/1) penalty, but it would be more principle-based rather than heuristic-based. The log(pages/1) change is more suitable for back-patching because it is more conservative, being asymptotic with the previous behavior at the low end. But I don't think that the case for that previous behavior was ever all that strong. If we really want a heuristic to give a bonus to partial indexes, maybe we should explicitly give them a bonus, rather than penalizing ordinary indexes (which penalty is then used in comparing them to hash joins and such, not just partial indexes). maybe something like bonus = 0.05 * (reltuples-indextuples)/reltuples Cheers, Jeff
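As a rough sanity check on the magnitude of the descent-cost term proposed above, the same arithmetic can be done in SQL for a hypothetical index of ten million tuples (the tuple count is invented; the GUC defaults are the stock ones):

SELECT ceil(log(2, 10000000))
       * (current_setting('cpu_index_tuple_cost')::float8
          + current_setting('cpu_operator_cost')::float8) AS descent_cost;
-- with default settings this is 24 * (0.005 + 0.0025) = 0.18, i.e. a modest,
-- slowly growing penalty for larger indexes.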
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
On 2012-12-17 00:31:00 +0100, Tomas Vondra wrote: I've updated the patch to include the optimization described in the previous post, i.e. if the number of relations is below a certain threshold, use a simple for loop, for large numbers of relations use bsearch calls. This is done by a new constant BSEARCH_LIMIT, which is set to 10 in the patch. Then I've modified the 'drop-test' script to take yet another argument - number of indexes for each table. So for example this ./drop-test.py 1 100 3 'dbname=test' means create 1 tables, 3 indexes for each of them, and then drop them in batches of 100 tables. Then I've run the test with 0, 1, 2, ... 11 indexes for each table for (a) unpatched HEAD (b) patch v3.1 (without the optimization) (c) patch v3.3 (with BSEARCH_LIMIT=10) and I got these results: Nice work! 1) dropping one-by-one -- This is the really interesting part - the issue with the v3.1 is that for a single table, it's ~2x slower than unpatched PostgreSQL. 0 1 2 3 4 5 6 789 10 11 -- unpatched 16 28 40 52 63 75 87 99 110 121 135 147 v3.1 33 43 46 56 58 60 63 72 75 76 79 80 v3.3 16 20 23 25 29 33 36 40 44 47 79 82 The values are durations in seconds, rounded to integer values. I've run the test repeatedly and there's very small variance in the numbers. The v3.3 improves that and it's actually even faster than unpatched PostgreSQL. How can this happen? I believe it's because the code is rewritten from for each relation (r) in the drop list DropRelFileNodeAllBuffers (r) for each shared buffer check and invalidate to copy the relations from drop list to array (a) DropRelFileNodeAllBuffers(a) for each shared buffer for each relation (r) in the array check and invalidate At least that's the only explanation I was able to come up with. That seems like a rather sensible explanation. The earlier algorithm iterated over shared buffers multiple times which is the expensive thing here, the new version doesn't so its sensible that its faster as long as the comparison method for checking whether a buffer belongs to relation is suitably fast. Yet another interesting observation is that even the v3.1 is about as fast as the unpatched code once there are 3 or more indexes (or TOAST tables). So my opinion is that the optimizated patch works as expected, and that even without the optimization the performance would be acceptable for most real-world cases. I think except of the temp buffer issue mentioned below its ready. -DropRelFileNodeAllBuffers(RelFileNodeBackend rnode) +DropRelFileNodeAllBuffers(RelFileNodeBackend * rnodes, int nnodes) { - int i; + int i, j; + + /* sort the list of rnodes */ + pg_qsort(rnodes, nnodes, sizeof(RelFileNodeBackend), rnode_comparator); /* If it's a local relation, it's localbuf.c's problem. */ - if (RelFileNodeBackendIsTemp(rnode)) + for (i = 0; i nnodes; i++) { - if (rnode.backend == MyBackendId) - DropRelFileNodeAllLocalBuffers(rnode.node); - return; + if (RelFileNodeBackendIsTemp(rnodes[i])) + { + if (rnodes[i].backend == MyBackendId) + DropRelFileNodeAllLocalBuffers(rnodes[i].node); + } } While you deal with local buffers here you don't anymore in the big loop over shared buffers. That wasn't needed earlier since we just returned after noticing we have local relation, but thats not the case anymore. for (i = 0; i NBuffers; i++) { + RelFileNodeBackend *rnode = NULL; volatile BufferDesc *bufHdr = BufferDescriptors[i]; - + /* * As in DropRelFileNodeBuffers, an unlocked precheck should be safe * and saves some cycles. 
*/ - if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node)) + + /* + * For low number of relations to drop just use a simple walk through, + * to save the bsearch overhead. The BSEARCH_LIMIT is rather a guess + * than an exactly determined value, as it depends on many factors (CPU + * and RAM speeds, amount of shared buffers etc.). + */ + if (nnodes <= BSEARCH_LIMIT) I think that's a sensible plan. It makes sense that for a small number of relations a sequential scan of the rnodes array is faster than a bsearch, and 10 sounds like a good value, although I would guess the optimal value is slightly higher on most machines. But if it works fine without regressions that's pretty good... + +/* + * Used to sort relfilenode array
Re: [HACKERS] Patch for checking file parameters to psql before password prompt
On Sun, Dec 2, 2012 at 4:37 AM, Alastair Turner b...@ctrlf5.co.za wrote: Patch for the changes discussed in http://archives.postgresql.org/pgsql-hackers/2010-10/msg00919.php attached (eventually ...) In summary: If the input file (-f) doesn't exist or the ouput or log files (-o and -l) can't be created psql exits before prompting for a password. I assume you meant -L instead of -l here for specifying psql's log file. I don't think that the inability to write to psql's log file should be treated as a fatal error, especially since it is not treated as such by the current code: $ psql test -L /tmp/not_allowed psql: could not open log file /tmp/not_allowed: Permission denied [... proceeds to psql prompt from here ...] and the user (or script) may still usefully perform his work. Whereas with your patch: $ psql test -L /tmp/not_allowed psql: could not open log file /tmp/not_allowed: Permission denied $ And IMO the same concern applies to the query results file, -o. Although +1 for the part about having psql exit early if the input filename does not exist, since psql already bails out in this case, and there is nothing else to be done in such case. Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Makefiles don't seem to remember to rebuild everything anymore
On Mon, Dec 17, 2012 at 1:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Now perhaps this is not make's fault so much as a lack of adequate dependency specifications. It may be that we can still use .SECONDARY if we add the $(OBJS) lists as explicit targets of make all in backend directories, but I'm not sure how invasive that would be. I experimented a bit with this: diff --git a/src/backend/common.mk b/src/backend/common.mk index 2e56151..822b1e9 100644 *** a/src/backend/common.mk --- b/src/backend/common.mk *** SUBDIROBJS = $(SUBDIRS:%=%/$(subsysfilen *** 20,26 # top-level backend directory obviously has its own all target ifneq ($(subdir), src/backend) ! all: $(subsysfilename) endif SUBSYS.o: $(SUBDIROBJS) $(OBJS) --- 20,26 # top-level backend directory obviously has its own all target ifneq ($(subdir), src/backend) ! all: $(subsysfilename) $(OBJS) endif SUBSYS.o: $(SUBDIROBJS) $(OBJS) which seems to fix the main issue, but it's still a bit wonky as far as making objfiles.txt goes: $ cd pgsql/src/backend/parser/ $ rm analyze.o rm: remove regular file `analyze.o'? y $ make gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I. -I. -I../../../src/include -D_GNU_SOURCE -c -o analyze.o analyze.c $ make touch objfiles.txt $ make make: Nothing to be done for `all'. This is definitely not per make's contract, either. I think maybe the Don't rebuild the list if only the OBJS have changed hack in common.mk is a brick or two shy of a load, but I don't know how to fix that. I feel like it's been this way for a while - at least I feel like I've noticed this before. I think there is some inevitable kludginess around having one makefile per subdirectory that leads to these kinds of issues. Maybe we should get rid of all the makefiles under src/backend except for the top-level one and just do everything there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical decoding - GetOldestXmin
On Tue, Dec 18, 2012 at 7:59 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-12-18 19:56:18 -0500, Robert Haas wrote: On Tue, Dec 18, 2012 at 5:25 PM, anara...@anarazel.de and...@anarazel.de wrote: The problem is that at the time GetSnapshotData returns the xmin horizon might have gone upwards and tuples required for decoding might get removed by other backends. That needs to be prevented while holding the procarray lock exclusively. Well, for the ordinary use of GetSnapshotData(), that doesn't matter, because GetSnapshotData() also updates proc-xmin. If you're trying to store a different value in that field then of course it matters. Absolutely right. I don't want to say there's anything wrong with it right now. The problem for me is that it sets proc-xmin to the newest value it can while I want/need the oldest valid value... I will go with adding a already_locked parameter to GetOldestXmin then. Or instead of bool already_locked, maybe bool advertise_xmin? Seems like that might be more friendly to the abstraction boundaries. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2012-12-17 11:44:00 +0900, Michael Paquier wrote: Thanks for all your comments. The new version (v5) of this patch fixes the error you found when reindexing indexes being referenced in foreign keys. The fix is done with switchIndexConstraintOnForeignKey:pg_constraint.c, in charge of scanning pg_constraint for foreign keys that refer to the parent relation (confrelid) of the index being swapped, and then switching conindid to the new index if the old index was referenced. This API also takes care of switching the dependency between the foreign key and the old index by calling changeDependencyFor. I also added a regression test for this purpose. Ok. Are there no other dependencies towards indexes? I don't know of any right now, but I have the feeling there were some other cases. On Tue, Dec 11, 2012 at 12:28 AM, Andres Freund and...@2ndquadrant.com wrote: Some review comments: * Some of the added !is_reindex in index_create don't seem safe to me. This is added to control the concurrent index relation for toast indexes. If we do not add an additional flag for that it will not be possible to reindex a toast index concurrently. I think some of them were added for cases that didn't seem to be related to that. I'll recheck in the current version. * Why do we now support reindexing exclusion constraints? CREATE INDEX CONCURRENTLY is not supported for exclusion constraints, but I played around with exclusion constraints with my patch and did not see any particular problems in supporting them, as for example index_build performs a second scan of the heap when running, so it looks solid enough for that. Is it because the structure of the REINDEX CONCURRENTLY patch is different? Honestly I think not, so is there something I am not aware of? I think I asked because you had added an !is_reindex to one of the checks. If I recall correctly the reason why concurrent index builds couldn't support exclusion constraints - namely that we cannot use them to check for new row versions when the index is in the ready-but-!valid state - that shouldn't be a problem when we have a valid version of an old index around, because that one enforces everything. It would maybe need an appropriate if (!isvalid) in the exclusion constraint code, but that should be it. * REINDEX DATABASE .. CONCURRENTLY doesn't work, a variant that does the concurrent reindexing for user tables and non-concurrent for system tables would be very useful. E.g. for the upgrade from 9.1.5-9.1.6... OK. I thought that this was out of scope for the time being. I haven't done anything about that yet. Supporting that will not be complicated as ReindexRelationsConcurrently (new API) is more flexible now; the only thing needed is to gather the list of relations that need to be reindexed. Imo that so greatly reduces the usability of this patch that you should treat it as in scope ;). Especially as you say, it really shouldn't be that much work with all the groundwork built. * would be nice (but that's probably a step #2 thing) to do the individual steps of concurrent reindex over multiple relations to avoid too much overall waiting for other transactions. I think I did that by now using one transaction per index for each operation except the drop phase... Without yet having read the new version, I think that's not what I meant. There currently is a wait for concurrent transactions to end after most of the phases for every relation, right? If you have a busy database with somewhat long-running transactions that's going to slow everything down with waiting quite a bit.
I wondered whether it would make sense to do PHASE1 for all indexes in all relations, then wait once, then PHASE2... That obviously has some space and index maintenance overhead issues, but it's probably sensible anyway in many cases. * PHASE 6 should acquire exclusive locks on the indexes. The necessary lock is taken when calling index_drop through performMultipleDeletions. Do you think it is not enough and that I should add an exclusive lock inside index_concurrent_drop? It seems to be safer to acquire it earlier; otherwise the likelihood of deadlocks seems to be slightly higher as you're increasing the lock severity. And it shouldn't cause any disadvantages, so ... Starts to look really nice now! Isn't the following block content that's mostly available somewhere else already?
+ <refsect2 id="SQL-REINDEX-CONCURRENTLY">
+  <title id="SQL-REINDEX-CONCURRENTLY-title">Rebuilding Indexes Concurrently</title>
+
+  <indexterm zone="SQL-REINDEX-CONCURRENTLY">
+   <primary>index</primary>
+   <secondary>rebuilding concurrently</secondary>
+  </indexterm>
+
+  <para>
+   Rebuilding an index can interfere with regular operation of a database.
+   Normally <productname>PostgreSQL</> locks the table whose index is rebuilt
+   against writes and performs the entire index build with a single scan of the
+   table. Other transactions can still read the table, but if
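[For readers following the pg_constraint handling discussed in the review above, here is an illustrative query - a sketch only, not part of the patch, and "my_table" is a hypothetical referenced table - showing the catalog rows that switchIndexConstraintOnForeignKey has to repoint: a foreign key records the unique index it depends on in pg_constraint.conindid, which is why swapping in a new index must also update that column.]

    -- Foreign keys whose backing index belongs to the referenced table,
    -- i.e. the rows that must be switched to the new index during the swap.
    SELECT c.conname,
           c.conrelid::regclass  AS referencing_table,
           c.confrelid::regclass AS referenced_table,
           c.conindid::regclass  AS index_backing_the_fk
    FROM pg_constraint c
    WHERE c.contype = 'f'
      AND c.confrelid = 'my_table'::regclass;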
Re: [HACKERS] Switching timeline over streaming replication
Heikki, I ran into an unexpected issue while testing. I just wanted to fire up a chain of 5 replicas to see if I could connect them in a loop. However, I ran into a weird issue when starting up r3: it refused to come out of the "database is starting up" mode until I did a write on the master. Then it came up fine. master--r1--r2--r3--r4 I tried doing the full replication sequence (basebackup, startup, test) with it twice and got the exact same results each time. This is very strange because I did not encounter the same issue with r2 or r4. Nor have I seen this before in my tests. I'm also seeing Thom's spurious error message now. Each of r2, r3 and r4 has the following message once in its logs: LOG: database system was interrupted while in recovery at log time 2012-12-19 02:49:34 GMT HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. This message doesn't seem to signify anything. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
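[As an aside for anyone reproducing this: while a standby is stuck like that it refuses connections, but its upstream still shows whether WAL is flowing to it. A hedged sketch of what could be checked on r2 while r3 appears stuck, using the 9.2/9.3 column names:]

    -- Run on the upstream (r2 here): is the stuck standby connected,
    -- and are its write/replay positions advancing?
    SELECT application_name, client_addr, state,
           sent_location, write_location, flush_location, replay_location
    FROM pg_stat_replication;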
[HACKERS] Cascading replication: should we detect/prevent cycles?
Folks, So as a test I tried to connect a group of 9.3 streaming replicas in a circle (4 replicas). This was very easy to do:

1. create r1 as replica of master
2. create r2 as replica of r1
3. create r3 as replica of r2
4. create r4 as replica of r3
5. start traffic on master
6. shut down r1
7. point r1 to r4 in recovery.conf
8. start r1

Replicas are now successfully connected in this pattern:

r1 --- r2
^       |
|       |
|       v
r4 --- r3

pg_stat_replication displays the correct information on each replica. So, my question is:

1. should we detect for replication cycles? *Can* we?
2. should we warn the user, or refuse to start up?

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
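[As an aside on why detection is awkward today: each node's pg_stat_replication only lists its downstream standbys, so tracing the ring means hopping from node to node by hand. A hedged sketch of the per-node check:]

    -- Run on each node in turn; client_addr lets you walk the ring,
    -- but no single node can see the whole topology.
    SELECT client_addr, application_name, state, sync_state
    FROM pg_stat_replication;

    -- And on the same node, confirm whether it is itself a standby:
    SELECT pg_is_in_recovery();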
[HACKERS] Feature Request: pg_replication_master()
Hackers, Currently we can see each master's current replicas using pg_stat_replication. However, there is no way from a replica, that I know of, to figure out who its master is other than to look at recovery.conf. We should probably have a function, like pg_replication_master(), which gives the host address of the current master. This would help DBAs for large replication clusters a lot. Obviously, this would only work in streaming. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
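[For illustration, the closest workaround today - a hedged sketch, and superuser-only because pg_read_file has a hardcoded superuser check - is to read recovery.conf from SQL on the replica; the call in the trailing comment is the hypothetical proposed interface, not anything that exists.]

    -- On the replica: dump recovery.conf, which contains primary_conninfo.
    SELECT pg_read_file('recovery.conf');

    -- What the proposed function might look like (hypothetical, not implemented):
    -- SELECT pg_replication_master();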
Re: [HACKERS] Makefiles don't seem to remember to rebuild everything anymore
Robert Haas robertmh...@gmail.com writes: On Mon, Dec 17, 2012 at 1:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: This is definitely not per make's contract, either. I think maybe the "Don't rebuild the list if only the OBJS have changed" hack in common.mk is a brick or two shy of a load, but I don't know how to fix that. I feel like it's been this way for a while - at least I feel like I've noticed this before. I think there is some inevitable kludginess around having one makefile per subdirectory that leads to these kinds of issues. Maybe we should get rid of all the makefiles under src/backend except for the top-level one and just do everything there. I mentioned this paper last year, but maybe it's time to start taking it seriously: http://aegis.sourceforge.net/auug97.pdf regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] too much pgbench init output
On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra t...@fuzzy.cz wrote: Hi, attached is a new version of the patch that (a) converts the 'log_step_seconds' variable to a constant (and does not allow changing it using a command-line option etc.) (b) keeps the current logging as a default (c) adds a -q switch that enables the new logging with a 5-second interval I'm still not convinced there should be yet another knob for tuning the log interval - opinions? It seems that you have generated a patch over your earlier version and due to that it is not cleanly applying on fresh sources. Please generate the patch against fresh sources. However, I have absolutely no issues with the design. Also, code review is already done and it looks good to me. I think to move forward on this we need someone from the core team. So I am planning to change its status to ready-for-committer. Before that please provide an updated patch for final code review. Thanks Tomas On 11.12.2012 10:23, Jeevan Chalke wrote: On Sun, Dec 9, 2012 at 8:11 AM, Tomas Vondra t...@fuzzy.cz wrote: On 20.11.2012 08:22, Jeevan Chalke wrote: Hi, On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra t...@fuzzy.cz wrote: On 19.11.2012 11:59, Jeevan Chalke wrote: Hi, I went through the discussion for this patch and here is my review: The main aim of this patch is to reduce the number of log lines. It was also suggested to use an option to provide the interval, but a few of us did not quite agree with that. So the final discussion ended at keeping a 5-second interval between each log line. However, I see there are two types of users here: 1. those who like these log lines, so that they can troubleshoot some slowness, and 2. those who do not like these log lines. Who likes these lines / needs them for something useful? No idea. I fall in the second category. But from the discussion, I believe some people may need detailed (or a lot more) output. I've read the thread again and my impression is that no one really needs or likes those lines, but (1) it's rather pointless to print a message every 100k rows, as it usually fills the console with garbage (2) it's handy to have regular updates of the progress I don't think there are people (in the thread) that require keeping the current amount of log messages. But there might be users who actually use the current logs to do something (although I can't imagine what). If we want to do this in a backwards compatible way, we should probably use a new option (e.g. -q) to enable the new (less verbose) logging. Do we want to allow both types of logging, or shall we keep only the new one? If both, which one should be the default one? Both options are fine with me, but the default should be the current behaviour. So keeping these in mind, I would rather go for an option which will control this. People falling in category one can set this option very low, whereas users falling in the second category can keep it high. So what option(s) would you expect? Something that tunes the interval length or something else? Interval length. Well, I can surely imagine something like --interval N. +1 A switch that'd choose between the old and new behavior might be a good idea, but I'd strongly vote against automagic heuristics. It makes the behavior very difficult to predict and I really don't want to force the users to wonder whether the long delay is due to general slowness of the machine or some clever logic that causes long delays between log messages.
That's why I chose a very simple approach with a constant time interval. It does what I was aiming for (fewer logs) and it's easy to predict. Sure, we could choose a different interval (or make it an option). I would prefer an option for choosing an interval, say from 1 second to 10 seconds. Uh, why not allow an arbitrary integer? Why say 1 to 10 seconds? Hmm.. actually, I have no issues with any number there. I just put 1..10 as we hard-coded it to 5. No particular reason as such. BTW, what if we put one log message every 10% (or 5%) with the time taken (time taken for the last 10% (or 5%) and cumulative) over 5 seconds? This will have only 10 (or 20) lines per pgbench
Re: [HACKERS] Feature Request: pg_replication_master()
On Dec 19, 2012 4:43 AM, Josh Berkus j...@agliodbs.com wrote: Hackers, Currently we can see each master's current replicas using pg_stat_replication. However, there is no way from a replica, that I know of, to figure out who its master is other than to look at recovery.conf. We should probably have a function, like pg_replication_master(), which gives the host address of the current master. This would help DBAs for large replication clusters a lot. Obviously, this would only work in streaming. This sounds like my previous suggestion of returning the primary conninfo value, but with just ip. That one came with a pretty bad patch, and was later postponed until we folded recovery.conf into the main configuration file parsing. I'm not really sure what happened to that project? (the configuration file one) /Magnus
Re: [HACKERS] system administration functions with hardcoded superuser checks
On Wed, Dec 19, 2012 at 1:58 AM, Tomas Vondra t...@fuzzy.cz wrote: On 18.12.2012 18:38, Pavel Stehule wrote: 2012/12/18 Peter Eisentraut pete...@gmx.net: There are some system administration functions that have hardcoded superuser checks, specifically: pg_reload_conf pg_rotate_logfile Some of these are useful in monitoring or maintenance tools, and the hardcoded superuser checks require that these tools run with maximum privileges. Couldn't we just install these functions without default privileges and allow users to grant privileges as necessary? Isn't that too strong a gun for some people??? I believe someone could decrease the necessary rights and it would open doors into the system. No one was speaking about making them executable by a wider group of users by default (i.e. decreasing necessary rights). Today, when you need to provide the EXECUTE privilege on those functions, you have three options Given how limited these functions are in scope, I don't see a problem here. pg_read_file pg_read_file_all pg_read_binary_file pg_read_binary_file_all pg_stat_file pg_ls_dir are relatively dangerous and I am not for opening these functions. A power user can simply write an extension, but then he knows what he is doing. I see only dangers that are already present. Granting executability on pg_read_xyz is pretty darn close to granting superuser, without explicitly asking for it. Well, you get a read-only superuser. If we want to make that step as easy as just GRANT, we really need to write some *very* strong warnings in the documentation so that people realize this. I doubt most people will realize it unless we do that (and those who don't read the docs, which is probably a majority, never will). If you use SECURITY DEFINER, you can limit the functions to *the specific files that you want to grant read on*. Which makes it possible to actually make it secure. E.g. you *don't* have to give full read to your entire database. If you're comparing it to a blanket SECURITY DEFINER with no checks, then yes, it's a simpler way to fire the cannon into your own foot. But it also gives you a way that makes it more likely that you don't *realize* that you're about to fire a cannon into your foot. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
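[For illustration of the SECURITY DEFINER route described above, a minimal sketch; the function name, the file it exposes, and the "monitor" role are assumptions, not anything proposed in the thread.]

    -- Created by a superuser; the body runs with the owner's rights, so the
    -- hardcoded superuser check inside pg_read_file passes, but callers can
    -- only read this one file.
    CREATE OR REPLACE FUNCTION read_postgresql_conf() RETURNS text
    LANGUAGE sql SECURITY DEFINER SET search_path = pg_catalog
    AS $$ SELECT pg_read_file('postgresql.conf') $$;

    REVOKE ALL ON FUNCTION read_postgresql_conf() FROM PUBLIC;
    GRANT EXECUTE ON FUNCTION read_postgresql_conf() TO monitor;  -- "monitor" is a hypothetical role

[Whether that is easier to get right than a blanket GRANT on pg_read_file itself is exactly the trade-off being discussed.]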
Re: [HACKERS] proposal - assign result of query to psql variable
On Tue, Dec 18, 2012 at 2:52 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Attached updated patch Seems fine. I'll mark this as ready for committer. Nice work! -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Cascading replication: should we detect/prevent cycles?
On 19 December 2012 03:03, Josh Berkus j...@agliodbs.com wrote: So, my question is: 1. should we detect for replication cycles? *Can* we? 2. should we warn the user, or refuse to start up? Why not just monitor the config you just created? Anybody that actually tests their config would spot this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers