Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-09-01 Thread hubert depesz lubaczewski
On Wed, Aug 31, 2011 at 09:54:20PM -0400, Bruce Momjian wrote:
 Working with depesz, I have found the cause.  The code I added to fix
 pg_upgrade in 9.0.4 and earlier releases didn't handle old 8.3 servers
 properly.  I mistakenly processed toast tables with the same pg_dump
 query as used for pre-8.4 toast tables, not realizing those were not
 functional because there were no reloptions for toast tables in pre-8.4.

Thanks a lot. Will test and post results (around Sunday/Monday, I guess).

Best regards,

depesz




Re: [HACKERS] postgresql.conf archive_command example

2011-09-01 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 Well, we could make initdb patch it up, but that might seem excessive.

I sometimes wonder if archive_mode shouldn't default to on, with
archive_command set to either '/bin/true' or 'rem' for Windows.

That would allow proper archiving to be installed without a restart, but
the tradeoff, of course, is that you then need a restart to enable some
optimisation cases by turning archive_mode off.
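
For illustration, such a default might look like this in postgresql.conf
(a sketch of the suggestion above, not settings that ship today):

    # hypothetical out-of-the-box settings, per the proposal above
    archive_mode = on                # changing it still requires a restart
    archive_command = '/bin/true'    # no-op: reports success, archives nothing
    #archive_command = 'rem'         # no-op equivalent under Windows cmd.exe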

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] setlocale() on Windows is broken

2011-09-01 Thread Heikki Linnakangas

On 31.08.2011 16:05, Heikki Linnakangas wrote:

While looking through old emails, I bumped into this:

http://archives.postgresql.org/message-id/25219.1303306...@sss.pgh.pa.us

To recap, setlocale() on Windows is broken for locale names that contain
dots or apostrophes in the country name. That includes "Hong Kong
S.A.R.", "Macau S.A.R.", "U.A.E." and "People's Republic of China".

In April, I put in a hack to initdb to map those problematic names to
aliases that don't contain dots:

People's Republic of China -> China
Hong Kong S.A.R. -> HKG
U.A.E. -> ARE
Macau S.A.R. -> ZHM

However, Hiroshi pointed out in the thread linked above that that
doesn't completely solve the problem. If you set the locale to HKG, for
example, setlocale(LC_ALL, NULL) still returns the full name, "Hong Kong
S.A.R.", and if you feed that back to setlocale() it fails. In
particular, check_locale() uses saved = setlocale(LC_XXX, NULL) to get
the current value, and tries to restore it later with setlocale(LC_XXX,
saved).


At first, I thought I should revert my hack in initdb, since it's not
fully solving the problem anyway. But reverting doesn't really help - you
run into the same issue if you set the locale to one of those aliases
manually. And that's exactly what users will have to do if we don't map
those locales automatically.

Microsoft should fix their bug. I don't have much faith in that
happening, however. So, I think we should move the mapping from initdb
to somewhere in src/port, so that the mapping is done every time
setlocale() is called. That would fix the problem with check_locale():
even though setlocale(LC_XXX, NULL) returns a value that won't work,
the setlocale() call to restore it would map it to an alias that does
work again.

In addition to that, I think we should check the return value of
setlocale() in check_locale(), and throw a warning if restoring the old
locale fails. The session's locale will still be screwed, but at least
you'll know if it happens.
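
To spell out the save/restore pattern in question, here's a simplified
sketch (not the actual check_locale() code; the backend version differs
in detail):

#include <locale.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static bool
check_locale_sketch(int category, const char *value)
{
	char	   *save;
	bool		ok = false;

	/* setlocale(category, NULL) reports the current setting ... */
	save = setlocale(category, NULL);
	if (save)
		save = strdup(save);	/* the returned buffer may be clobbered */

	if (setlocale(category, value) != NULL)
		ok = true;

	/*
	 * ... but restoring it can fail on Windows when "save" is a name like
	 * "Hong Kong S.A.R." that setlocale() refuses to accept, so the return
	 * value must be checked and a warning raised.
	 */
	if (save)
	{
		if (!setlocale(category, save))
			fprintf(stderr, "failed to restore old locale \"%s\"\n", save);
		free(save);
	}
	return ok;
}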


I've committed a patch along those lines.

It turned out to be pretty difficult to reproduce user-visible buggy
behavior caused by this bug, so for the sake of the archives, here's a
recipe for it:


1. Set the system locale to "Chinese_Hong Kong S.A.R..950"

2. initdb -D data --locale=Arabic_ARE

3. Launch psql.

  CREATE TABLE foo (a text);
  INSERT INTO foo VALUES ('a'), ('A');

  -- Verify that the order is 'a', 'A'
  SELECT * FROM foo ORDER BY a;

  -- This fails, as it should
  CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;

  -- This also fails, as it should
  CREATE DATABASE postgres WITH LC_COLLATE='C' TEMPLATE=template0;

  -- The order returned by this is now wrong: 'A', 'a'
  SELECT * FROM foo ORDER BY a;

It's a bizarre-looking sequence, but that does it.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP: Fast GiST index build

2011-09-01 Thread Heikki Linnakangas

On 30.08.2011 13:38, Alexander Korotkov wrote:

On Tue, Aug 30, 2011 at 1:08 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:



Thanks. Meanwhile, I hacked together my own set of test scripts, and let
them run over the weekend. I'm still running tests with ordered data, but
here are some preliminary results:

   testname   |   nrows   |duration | accesses
-----------------------------+-----------+-----------------+----------
  points unordered auto   | 25000 | 08:08:39.174956 |  3757848
  points unordered buffered   | 25000 | 09:29:16.47012  |  4049832
  points unordered unbuffered | 25000 | 03:48:10.999861 |  4564986

As you can see, the results are very disappointing :-(. The buffered builds
take a lot *longer* than unbuffered ones. I was expecting the buffering to
be very helpful at least in these unordered tests. On the positive side, the
buffering made index quality somewhat better (accesses column, smaller is
better), but that's not what we're aiming at.

What's going on here? This data set was large enough to not fit in RAM, the
table was about 8.5 GB in size (and I think the index is even larger than
that), and the box has 4GB of RAM. Does the buffering only help with even
larger indexes that exceed the cache size even more?


This seems pretty strange to me. The time of the unbuffered index build shows
that there is no bottleneck at IO. That radically differs from my
experiments. I'm going to try your test script on my test setup.
For now I can only offer the guess that the random function appears to be
somewhat bad, so the unordered dataset behaves like the ordered one. Can you
rerun the tests on your test setup with dataset generation on the backend, like
this?
CREATE TABLE points AS (SELECT point(random(), random()) FROM
generate_series(1,1000));


So I changed the test script to generate the table as:

CREATE TABLE points AS SELECT random() as x, random() as y FROM 
generate_series(1, $NROWS);


The unordered results are in:

  testname   |   nrows   |duration | accesses
-----------------------------+-----------+-----------------+----------
 points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
 points unordered auto   | 25000 | 05:34:12.187479 |  2246420
 points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228

Although the buffered build doesn't lose as badly as it did with more 
overlap, it still doesn't look good :-(. Any ideas?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP: Fast GiST index build

2011-09-01 Thread Alexander Korotkov
On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 So I changed the test script to generate the table as:

 CREATE TABLE points AS SELECT random() as x, random() as y FROM
 generate_series(1, $NROWS);

 The unordered results are in:

  testname   |   nrows   |duration | accesses
 -----------------------------+-----------+-----------------+----------
  points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
  points unordered auto   | 25000 | 05:34:12.187479 |  2246420
  points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228

 Although the buffered build doesn't lose as badly as it did with more
 overlap, it still doesn't look good :-(. Any ideas?


But it's still a lot of overlap: about 220 accesses per small-area
request, which is about 10 - 20 times greater than it should be without
overlaps. If we roughly assume that 10 times more overlap makes 1/10 of the
tree be used for actual inserts, then that part of the tree can easily fit in
the cache. You can try my splitting algorithm on your test setup (in this
case I advise starting from a smaller number of rows, 100 M for example).
I'm asking Oleg for real-life datasets that cause trouble in real life.
Probably those datasets are even larger, or the new linear split produces
less overlap on them.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] WIP: Fast GiST index build

2011-09-01 Thread Heikki Linnakangas

On 01.09.2011 12:23, Alexander Korotkov wrote:

On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


So I changed the test script to generate the table as:

CREATE TABLE points AS SELECT random() as x, random() as y FROM
generate_series(1, $NROWS);

The unordered results are in:

  testname   |   nrows   |duration | accesses
-----------------------------+-----------+-----------------+----------
  points unordered buffered   | 25000 | 05:56:58.575789 |  2241050
  points unordered auto   | 25000 | 05:34:12.187479 |  2246420
  points unordered unbuffered | 25000 | 04:38:48.663952 |  2244228

Although the buffered build doesn't lose as badly as it did with more
overlap, it still doesn't look good :-(. Any ideas?



But it's still a lot of overlap: about 220 accesses per small-area
request, which is about 10 - 20 times greater than it should be without overlaps.


Hmm, those accesses numbers are actually quite bogus for this test. I 
changed the creation of the table as you suggested, so that all x and y 
values are in the range 0.0 - 1.0, but I didn't change the loop to 
calculate those accesses, so it still queried for boxes in the range 0 - 
10. That makes me wonder, why does it need 220 accesses on average 
to satisfy queries most of which lie completely outside the range of 
actual values in the index? I would expect such queries to just look at 
the root node, conclude that there can't be any matching tuples, and 
return immediately.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] postgresql.conf archive_command example

2011-09-01 Thread Robert Haas
2011/9/1 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Peter Eisentraut pete...@gmx.net writes:
 Well, we could make initdb patch it up, but that might seem excessive.

 I sometimes wonder if archive_mode shouldn't default to on, with
 archive_command set to either '/bin/true' or 'rem' for Windows.

 That would allow proper archiving to be installed without a restart, but
 the tradeoff, of course, is that you then need a restart to enable some
 optimisation cases by turning archive_mode off.

Seems like it would be better to fix archive_mode so that it can be
changed without a restart.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST

2011-09-01 Thread Oleg Bartunov
This is an updated SP-GiST patch, which fixes one bug and
replaces the test with a locale-independent one.


On Wed, 31 Aug 2011, Oleg Bartunov wrote:


Hi there,

attached is our WIP patch for the 9.2 development source tree, which provides
an implementation of SP-GiST (the prototype was presented at PGCon 2011, see
http://www.pgcon.org/2011/schedule/events/309.en.html and the presentation
for details) as a core feature.  Main differences from the prototype version:

1. It's now part of the pg core, not a contrib module
2. It provides more operations for quadtree and suffix tree
3. It uses a clustering algorithm for nodes on disk and has much better
utilization of disk space. Fillfactor is supported
4. Some corner cases were eliminated
5. It provides support for concurrency and recovery (inserts are
logged, deletes are supported, and log replay will be added really
soon)

So, the code now contains almost all the overhead of production code,
and we ask hackers to test performance on real data sets. We expect
the same performance for random data (since there is almost no overlap) and
much better performance on real-life data, plus much better index
creation time. Also, we would appreciate your comments and suggestions
about the API.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

spgist_patch-0.85.gz
Description: Binary data



Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST

2011-09-01 Thread Alexander Korotkov
Hi!

I expect some problems in the support of comparison operators for text,
because locale-aware string comparison can have unexpected behaviour.
Let's look at an example. Create a table with words and add an extra leading
space to some of them.

test=# create table dict(id serial, word text);
NOTICE:  CREATE TABLE will create implicit sequence "dict_id_seq" for serial
column "dict.id"
CREATE TABLE
test=# \copy dict(word) from '/usr/share/dict/american-english';
test=# update dict set word = ' '||word where id%2=0;
UPDATE 49284

I use Ubuntu 11.04 with the ru_RU.utf8 locale, so the comparison operators
ignore leading spaces.

test=# select * from dict where word between 'cart' and 'cary';
  id   |  word
-------+-----------------
  3029 | Carter
  3031 | Cartesian
  3033 | Carthage's
  3035 | Cartier
  3037 | Cartwright
  3039 | Caruso
  3041 | Carver
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
  3030 |  Carter's
  3032 |  Carthage
  3034 |  Carthaginian
  3036 |  Cartier's
  3038 |  Cartwright's
  3040 |  Caruso's
  3042 |  Carver's
 28420 |  cart's
 28422 |  cartel
 28424 |  cartels
 28426 |  cartilage's
 28428 |  cartilaginous
 28430 |  cartographer
 28432 |  cartographers
 28434 |  cartography's
 28436 |  carton's
 28438 |  cartoon
 28440 |  cartooned
 28442 |  cartoonist
 28444 |  cartoonists
 28446 |  cartridge
 28448 |  cartridges
 28450 |  cartwheel
 28452 |  cartwheeled
 28454 |  cartwheels
 28456 |  carved
 28458 |  carver's
 28460 |  carves
 28462 |  carving's
(59 rows)

But if I create an SP-GiST index, the query result differs.

test=# create index dict_idx on dict using spgist (word);
CREATE INDEX
test=# select * from dict where word between 'cart' and 'cary';
  id   |  word
-------+-----------------
 28419 | cart
 28421 | carted
 28423 | cartel's
 28425 | cartilage
 28427 | cartilages
 28429 | carting
 28431 | cartographer's
 28433 | cartography
 28435 | carton
 28437 | cartons
 28439 | cartoon's
 28441 | cartooning
 28443 | cartoonist's
 28445 | cartoons
 28447 | cartridge's
 28449 | carts
 28451 | cartwheel's
 28453 | cartwheeling
 28455 | carve
 28457 | carver
 28459 | carvers
 28461 | carving
 28463 | carvings
(23 rows)

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-09-01 Thread Robert Haas
On Fri, Aug 26, 2011 at 5:32 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 Yes. It also caches an expected security label when a client being
 labeled as scontext tries to execute a procedure being labeled as
 tcontext, to reduce the number of system call invocations on fmgr_hook
 and needs_fmgr_hook.
 If the expected security label is not the same as scontext, it means
 the procedure performs as a trusted procedure that switches the security
 label of the client during its execution; like a security invoker
 function.
 A pair of security labels is the only factor to determine whether the
 procedure is a trusted procedure or not. Thus, it is suitable to
 cache in the userspace avc.

I've committed this, but I still think it would be helpful to revise
that comment.  The phrase "boosted up" is not very precise or
informative.  Could you submit a separate, comment-only patch to
improve this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] dblink make fails under postgresql 8.4.4 on mac osx 10.4.11

2011-09-01 Thread Robert Haas
On Wed, Aug 31, 2011 at 9:45 AM, Gary Merkel garymer...@accugenix.com wrote:
 Having trouble installing dblink under PostgreSQL 8.4.4 on MAC OS X 10.4.11

 Running make gives the following error:

 sed 's,MODULE_PATHNAME,$libdir/dblink,g' dblink.sql.in >dblink.sql

 gcc -no-cpp-precomp -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
 -I../../src/interfaces/libpq -I. -I../../src/include   -c -o dblink.o
 dblink.c
 dblink.c: In function 'get_pkey_attnames':
 dblink.c:1698: error: 'SnapshotNow' undeclared (first use in this function)
 dblink.c:1698: error: (Each undeclared identifier is reported only once
 dblink.c:1698: error: for each function it appears in.)
 make: *** [dblink.o] Error 1

 Does anyone know a fix to this?

This is hard to understand, because SnapshotNow is defined in
utils/tqual.h, and dblink.c includes that file.

Also, I checked out the official sources at tag REL8_4_4 and it looks
to me like there's no reference to SnapshotNow on line 1698.  That
line for me is blank, and the only reference I see to SnapshotNow is
on line 1715.

So I suspect that what you've got is a modified version of the sources
that is somehow messed up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-09-01 Thread Kohei Kaigai
 On Fri, Aug 26, 2011 at 5:32 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
  Yes. It also caches an expected security label when a client being
  labeled as scontext tries to execute a procedure being labeled as
  tcontext, to reduce the number of system call invocations on fmgr_hook
  and needs_fmgr_hook.
  If the expected security label is not the same as scontext, it means
  the procedure performs as a trusted procedure that switches the security
  label of the client during its execution; like a security invoker
  function.
  A pair of security labels is the only factor to determine whether the
  procedure is a trusted procedure or not. Thus, it is suitable to
  cache in the userspace avc.
 
 I've committed this, but I still think it would be helpful to revise
 that comment.  The phrase "boosted up" is not very precise or
 informative.  Could you submit a separate, comment-only patch to
 improve this?
 
OK, please wait a few days.

Thanks,
--
NEC Europe Ltd, SAP Global Competence Center
KaiGai Kohei kohei.kai...@emea.nec.com



Re: [HACKERS] mb_regress.sh gripes

2011-09-01 Thread Robert Haas
On Thu, Aug 18, 2011 at 6:19 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 A few gripes about mb_regress.sh:
  1. No exit code is specified, so even if there are differences
 between results/ and expected/ the script will still return 0.

  2. The 'dropdb' command is used to wipe out the utf8 database
 before the run. This generates an error message like:
  dropdb: database removal failed: ERROR:  database "utf8" does not exist

 the first time you run the script. IMO it would be less startling to
 just print a NOTICE here.

  3. No error checking for whether createdb succeeds.

 The attached patch fixes these problems.

Committed, with some changes.  I used the new --if-exists option for
dropdb rather than doing it as you had it here; I assume this may have
been the motivation for that patch.  I also just made the exit code 1
no matter how many failures there were.  That seems more normal, and I
wasn't altogether certain that $((expr)) is completely portable.  I
also set the execute bit on the script.
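
In other words, the script now ends up shaped roughly like this (a sketch
of the idea, not the committed script verbatim):

    dropdb --if-exists utf8            # quiet when the database isn't there
    createdb -T template0 -E UTF8 utf8 || exit 1

    # ... run the conversion tests, counting mismatches in $FAILED ...

    [ "$FAILED" -eq 0 ] || exit 1      # any failure means exit code 1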

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Removal of useless include references

2011-09-01 Thread Bruce Momjian
Bruce Momjian wrote:
 Bruce Momjian wrote:
  Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
It has been years since I ran src/tools/pginclude/pgrminclude to remove
unnecessary include files.  (I have already fixed things so include
files can be compiled on their own.)
   
The attached patch removes unneeded include references, and marks some
includes as needing to be skipped by pgrminclude.
   
I am sure applying this patch will break builds on some platforms and
some option combinations so I will monitor the buildfarm when I apply it
and make adjustments.
   
   The last time you did this was in July 2006.  It took us two weeks to
   mostly recover, but we were still dealing with some fallout in December,
   cf
   http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php
   
   We had the buildfarm then, had had it for a couple years.  The notion
   that watching the buildfarm is enough is fully disproven by history.
   
   Unless you have a better test plan than last time (which this isn't),
   I don't think this should be done at all.  The benefits are microscopic
   and the pain real.
  
  I don't have a better plan.  There are #ifdef code blocks that often
  don't get processed and therefore this can't be done better.  I will
  abandon the idea.
 
 OK, try #2.  I already had code that removed #if/#else/#endif code in
 *.h files for better testing, so I extended that to all *.c files.  This
 reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
 it much less likely that there will be problems from running
 pgrminclude.
 
 The current patch is here:
 
   http://momjian.us/expire/pgrminclude.diff
 
 I tested the patch on BSD and Linux.

I have re-run the script and applied the result, again tested on BSD and
Linux.   I will monitor the buildfarm for possible failures.

This is not something we are going to do regularly, but probably every
five years like this time.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Removal of useless include references

2011-09-01 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
 Bruce Momjian wrote:

  OK, try #2.  I already had code that removed #if/#else/#endif code in
  *.h files for better testing, so I extended that to all *.c files.  This
  reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
  it much less likely that there will be problems from running
  pgrminclude.
  
  The current patch is here:
  
  http://momjian.us/expire/pgrminclude.diff
  
  I tested the patch on BSD and Linux.
 
 I have re-run the script and applied the result, again tested on BSD and
 Linux.   I will monitor the buildfarm for possible failures.

I think anything of this sort should be tested on Windows too.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Removal of useless include references

2011-09-01 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
 I have re-run the script and applied the result, again tested on BSD and
 Linux.   I will monitor the buildfarm for possible failures.

 I think anything of this sort should be tested on Windows too.

Well, if Windows is broken we'll find out soon enough from the
buildfarm.  My recollection from the last go-round is that the pain
points were in non-default #define options that Bruce hadn't tested
and that no buildfarm critter exercised either, such as LOCK_DEBUG.

regards, tom lane



Re: [HACKERS] Removal of useless include references

2011-09-01 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
  Bruce Momjian wrote:
 
   OK, try #2.  I already had code that removed #if/#else/#endif code in
   *.h files for better testing, so I extended that to all *.c files.  This
   reduces the size of the diff from 6.6k lines to 4.7k lines but it makes
   it much less likely that there will be problems from running
   pgrminclude.
   
   The current patch is here:
   
   http://momjian.us/expire/pgrminclude.diff
   
   I tested the patch on BSD and Linux.
  
  I have re-run the script and applied the result, again tested on BSD and
  Linux.   I will monitor the buildfarm for possible failures.
 
 I think anything of this sort should be tested on Windows too.

Agreed, but we have so many Windows configurations I figured I would let
the buildfarm test them, no?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Removal of useless include references

2011-09-01 Thread Bruce Momjian
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from Bruce Momjian's message of jue sep 01 11:04:33 -0300 2011:
  I have re-run the script and applied the result, again tested on BSD and
  Linux.   I will monitor the buildfarm for possible failures.
 
  I think anything of this sort should be tested on Windows too.
 
 Well, if Windows is broken we'll find out soon enough from the
 buildfarm.  My recollection from the last go-round is that the pain
 points were in non-default #define options that Bruce hadn't tested
 and that no buildfarm critter exercised either, such as LOCK_DEBUG.

Ah, but this time I only removed includes from files I could compile with
all #if markers removed.  That only got 5.8k diff lines out of a possible
6.8k lines, but it seems like an acceptable cost for greater
reliability.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



[HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-01 Thread Hannu Krosing
Hello hackers,

I have the following problem to solve and would like to get advice on
the best way to do it.

The problem:

When growing a PL/Proxy based database cluster, one of the main
operations is splitting a partition. The standard flow is as follows:

1) make a copy of the partition's table(s) to another database
2) reconfigure PL/Proxy to use 2 partitions instead of one

The easy part is making a copy of all or half of the table to another
database. The hard part is fast deletion (I mean milliseconds,
comparable to TRUNCATE) of the data that should not be in a partition (so
that RUN ON ALL functions will continue to return the right results).

It would be relatively easy if we still had RULES for SELECT
available for plain tables, but even then the eventual cleanup would
usually mean at least 3 passes of disk writes (set xmax, write the
deleted flag, vacuum and remove).

What I would like to have is the possibility of additional visibility
checks, which would run some simple C function over the tuple data (usually
hash(fieldval) plus some ANDs and ORs) and return visibility (is in this
partition) as the result. It would be best if this were run at such a low
level that vacuum would also use it and could clean up the foreign
partition data in one pass, without doing the delete dance first.

So finally the QUESTION:

where in the code would be the best place to check for this, so that

1) both regular queries and VACUUM see it
2) the tuple data (and not only system fields or just xmin/xmax) would
be available for the function to use


-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/




Re: [HACKERS] Informix FDW - anybody working on this?

2011-09-01 Thread Robert Treat
On Wed, Aug 31, 2011 at 8:13 AM, Bernd Helmle maili...@oopsware.de wrote:
 Out of curiosity,

 is anybody working on $subject? I'm currently planning to work on such a
 driver,
 but given the current stream of new drivers i want to make sure to not
 duplicate any efforts...


The most complete list I've seen of FDWs is on the wiki:
http://wiki.postgresql.org/wiki/Foreign_data_wrappers

Note there is an ODBC FDW, which might work, but if you start working
on an Informix-specific one, please add it there. (I say this knowing
that one of my co-workers has the outlines of a Riak FDW he hasn't
listed yet... guess I should go pester him.)


Robert Treat
conjecture: xzilla.net
consulting: omniti.com



Re: [HACKERS] stored procedures

2011-09-01 Thread Josh Berkus
On 8/31/11 12:15 PM, Merlin Moncure wrote:
 An out of process, autonomous transaction type implementation should
 probably not sit under stored procedures for a number of reasons --
 mainly that it's going to expose too many implementation details to
 the user.  For example, does a SP heavy app have 2*N running
 processes?  Or do we slot them into a defined number of backends for
 that purpose? Yuck  yuck.  I like the AT feature, and kludge it
 frequently via dblink, but it's a solution for a different set of
 problems.

I think that transaction control without parallelism would be the 80%
solution.  That is, an SP would have transaction control, but those
transactions would be strictly serial, and could not be run in parallel.
For example, if you were writing an SP in PL/pgSQL, each BEGIN ... END
block would be an explicit transaction, and standalone-only statements
would be allowed between BEGIN ... END blocks, or possibly in their own
special block type (I prefer the latter).
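
To illustrate, in purely hypothetical syntax -- none of this exists, and
the names are made up:

    CREATE PROCEDURE rotate_stats() AS $$
    BEGIN                         -- transaction 1: commits at its END
        DELETE FROM stats_archive WHERE day < now() - interval '90 days';
    END;
    VACUUM stats_archive;         -- standalone-only statement between blocks
    BEGIN                         -- transaction 2: a new, serial transaction
        INSERT INTO stats_archive SELECT * FROM stats_current;
    END;
    $$ LANGUAGE plpgsql;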

One issue we'd need to deal with is exception control around
single-statement transactions and non-transactional statements (VACUUM,
CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.).  In some cases, the user
is going to want to catch exceptions and abort the SP, and in other
cases ignore them, so both need to be possible.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

2011-09-01 Thread Tom Lane
I wrote:
 What *does* seem feasible is to back-port just the single change we
 actually need, by copying the two relevant macros into one of our
 config/ source files for the configure script.  I've tested that in
 8.3 and it seems to work --- at least, the generated configure script
 changes in the expected way.  This also seems like a reasonably sane
 thing to back-port to 8.2.  So I'll go ahead and commit those things
 and see if anchovy likes it.

So the upshot is that that fixed the 8.3 build, but anchovy is still
failing on 8.2, with some different errors:

/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:y.tab.c:12777: error: undefined reference to 'filtered_base_yylex'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:gram.y:494: error: undefined reference to 'parsetree'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans7.ltrans.o: in function parseTypeString:parse_type.c:445: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function simplify_function.128434.2836:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function pg_parse_and_rewrite:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function fmgr_sql_validator:postgres.c:544: error: undefined reference to 'raw_parser'
collect2: ld returned 1 exit status

I went so far as to install Arch Linux here, but I cannot duplicate
the above.  (Although I wonder whether my machine is really doing
link-time optimization, since it doesn't generate any compiler warning
messages during the link step, as anchovy is doing.)

But these errors seem like they should be impossible anyway,
since there is nothing platform-specific about our uses of any of the
mentioned functions.  I wonder if there is something messed up with
anchovy's copy of REL8_2_STABLE.  Marti, could I trouble you to
blow away and recreate that machine's 8.2 checkout, as well as any
compiler cache directories?

regards, tom lane



Re: [HACKERS] stored procedures

2011-09-01 Thread Christopher Browne
On Thu, Sep 1, 2011 at 1:18 PM, Josh Berkus j...@agliodbs.com wrote:
 On 8/31/11 12:15 PM, Merlin Moncure wrote:
 An out of process, autonomous transaction type implementation should
 probably not sit under stored procedures for a number of reasons --
 mainly that it's going to expose too many implementation details to
 the user.  For example, does a SP heavy app have 2*N running
 processes?  Or do we slot them into a defined number of backends for
 that purpose? Yuck  yuck.  I like the AT feature, and kludge it
 frequently via dblink, but it's a solution for a different set of
 problems.

 I think that transaction control without parallelism would be the 80%
 solution.  That is, an SP would have transaction control, but those
 transactions would be strictly serial, and could not be run in parallel.
 For example, if you were writing an SP in PL/pgSQL, each BEGIN ... END
 block would be an explicit transaction, and standalone-only statements
 would be allowed between BEGIN ... END blocks, or possibly in their own
 special block type (I prefer the latter).

 One issue we'd need to deal with is exception control around
 single-statement transactions and non-transactional statements (VACUUM,
 CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.).  In some cases, the user
 is going to want to catch exceptions and abort the SP, and in other
 cases ignore them, so both need to be possible.

Yep, +1 on that.

Leaving out parallelism, and having the mechanism operate under the
auspices of a single connection, makes a fine start, and perhaps is
enough even in the longer run.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?



Re: [HACKERS] postgresql.conf archive_command example

2011-09-01 Thread Josh Berkus

 Seems like it would be better to fix archive_mode so that it can be
 changed without a restart.

+1

I'm also wondering if providing some example shell scripts for
fault-tolerant archiving would be useful.
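
Something along these lines, perhaps -- a minimal sketch only, assuming a
local archive directory; a production script would need more care:

    #!/bin/sh
    # sketch of a fault-tolerant archive_command target; invoke as
    #   archive_command = '/path/to/archive.sh %p %f'
    set -e
    SRC="$1"                      # %p: path of the WAL segment to archive
    DST="/mnt/archive/$2"         # %f: file name; the directory is an assumption

    test ! -f "$DST" || exit 1    # never silently overwrite an archived segment
    cp "$SRC" "$DST.tmp"          # copy under a temporary name first
    mv "$DST.tmp" "$DST"          # atomic rename: readers never see partial files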

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] postgresql.conf archive_command example

2011-09-01 Thread Robert Haas
On Thu, Sep 1, 2011 at 3:05 PM, Josh Berkus j...@agliodbs.com wrote:
 +1

 I'm also wondering if providing some shell script examples of a
 fault-tolerant script to handle archiving would be useful.

I think it would.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_upgrade automatic testing

2011-09-01 Thread Bruce Momjian
Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  +# contrib/pg_upgrade/test.sh
  +#
  +# Test driver for pg_upgrade.  Initializes a new database cluster,
  +# runs the regression tests (to put in some data), runs pg_dumpall,
  +# runs pg_upgrade, runs pg_dumpall again, compares the dumps.
 
 Hm .. my experience is that that doesn't work at all, because the
 regression tests set up assorted C functions whose implementations are
 in pg_regress.so, and it creates them with absolute path references
 to pg_regress.so.  When you try to load that into another installation
 that's a different version of PG, it quite properly fails.  So I think
 that as given, this script is only useful for testing pg_upgrade of
 $currentversion to $currentversion.  Which is surely better than no test

Reminder --- you can't use pg_upgrade to go from the same catalog
version to the same catalog version because the catalog version is
embedded in the tablespace directory name.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] pg_upgrade automatic testing

2011-09-01 Thread Bruce Momjian
Peter Eisentraut wrote:
 On tis, 2011-08-30 at 16:25 -0400, Tom Lane wrote:
  So I think that as given, this script is only useful for testing
  pg_upgrade of $currentversion to $currentversion.  Which is surely
  better than no test at all, but it would not for example have caught
  the 8.3 incompatibility that was just reported.
 
 Well, the goal was always current to current version.  Cross-version
 testing is obviously important, but will be quite a bit harder.
 
  How can we improve things here?  I've toyed with the idea of
  installing pg_regress.so so that we can refer to it relative to
  $libdir, but that might be a bit invasive, especially if we were to
  try to back-patch it as far as 8.3. 
 
 Aside from hesitations to backpatch those sorts of changes, it would
 effectively prevent us from ever removing anything from the C libraries
 used in the regression tests, because we need to keep the symbols around
 so that the schema dump can load successfully into the new instance.
 
 I think a solution would have to be one of:
 
 1) pg_upgrade needs a mode to cope with these situations.  It can tell
 the user, "I upgraded your installation, but some dynamic modules appear
 to be missing; you need to sort that out before you can put this back
 into use."
 
 2) Design a different test schema to load into the database before
 running pg_upgrade.  This would then be a one-line change in the script.

Here are the scripts I use for testing:

http://momjian.us/expire/pg_upgrade_test.tgz

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] sha1, sha2 functions into core?

2011-09-01 Thread Daniel Farina
On Wed, Aug 31, 2011 at 11:12 AM, Ross J. Reedstrom reeds...@rice.edu wrote:
 Hmm, this thread seems to have petered out without a conclusion. Just
 wanted to comment that there _are_ non-password storage uses for these
 digests: I use them in a context of storing large files in a bytea
 column, as a means to doing data deduplication, and avoiding pushing
 files from clients to server and back.

Yes, agreed: there is no decent content-addressing type in PostgreSQL,
so one rolls one's own using SHAs and joins; I've seen this more than
once. It's a useful way to get a non-bloated index on a series of
(larger than sha1) values where one only cares about the equality
operator (hash indexes, unattractive as they were before in
PostgreSQL's implementation, are even less attractive now with streaming
replication).

When that content to be addressed can be submitted from another
source, anything with md5 is correctly met with suspicion. We have
gone to the trouble of using pgcrypto to get sha1 access, but I know
of other applications that would have preferred to use sha but settle
for md5 simply because it's known to be bundled in core everywhere.

CREATE EXTENSION -- particularly if there is *any* way (is there? even
with ugliness like utility statement hooks) to configure it on the
provider end to not require superuser for common extensions like
'pgcrypto' -- could ablate this issue and one could get off the hash
treadmill, including md5 -- but I think that would be a mistake.
Applications need a high quality digest to enable any kind of
principled content addressing use case, and I think making that any
harder than a builtin is going to negatively impact the state of
things at large.  As a compromise, I'd also be happy with making
CREATE EXTENSION so trivial that everyone who has that use case can
get pgcrypto on any hosting provider.

-- 
fdr



Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-01 Thread Tomas Vondra
I've prepared a significantly simplified version of the patch. The two
main changes are:

(a) it does not update pg_stat_bgwriter anymore; it just prints an
info message to the server log

(b) a new GUC is not required; it's driven by log_checkpoints

This version will log at least 10 'checkpoint status' lines (at 10%, 20%,
30%, ...) and also whenever 5 seconds have elapsed since the last log
message. The time is not checked for each buffer, but once every 128
buffers.

So if the checkpoint is very slow you'll get a message every 5 seconds,
and if it's fast you'll get 10 messages.

Tomas

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4c7cfb0..e60d304
*** a/src/backend/storage/buffer/bufmgr.c
--- b/src/backend/storage/buffer/bufmgr.c
*** BufferSync(int flags)
*** 1175,1180 ****
--- 1175,1193 ----
  	int			num_to_write;
  	int			num_written;
  	int			mask = BM_DIRTY;
+ 
+ 	int			num_since_update;
+ 
+ 	long		curr_secs,
+ 				total_secs;
+ 	int			curr_usecs,
+ 				total_usecs;
+ 	float		curr_time,
+ 				total_time;
+ 
+ 	TimestampTz startTimestamp, lastTimestamp;
+ 
+ 	int			log_interval, check_interval;
  
  	/* Make sure we can handle the pin inside SyncOneBuffer */
  	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
*** BufferSync(int flags)
*** 1238,1243 ****
--- 1251,1265 ----
  	buf_id = StrategySyncStart(NULL, NULL);
  	num_to_scan = NBuffers;
  	num_written = 0;
+ 	num_since_update = 0;
+ 
+ 	startTimestamp = GetCurrentTimestamp();
+ 	lastTimestamp = startTimestamp;
+ 
+ 	/* check the progress every 128 buffers or 10% */
+ 	log_interval = (int) round(num_to_write / 10);
+ 	check_interval = ((log_interval < 128) && (log_interval > 0)) ? log_interval : 128;
+ 
  	while (num_to_scan-- > 0)
  	{
  		volatile BufferDesc *bufHdr = &BufferDescriptors[buf_id];
*** BufferSync(int flags)
*** 1261,1266 ****
--- 1283,1334 ----
  				TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
  				BgWriterStats.m_buf_written_checkpoints++;
  				num_written++;
+ 				num_since_update++;
+ 
+ 				/*
+ 				 * Every time we write enough buffers (checkpoint_update_limit),
+ 				 * we log a checkpoint status message and update the bgwriter
+ 				 * stats (so that the pg_stat_bgwriter table may be updated).
+ 				 *
+ 				 * The log message contains info about the total number of
+ 				 * buffers to write, how many buffers are already written,
+ 				 * average and current write speed, and an estimate of the
+ 				 * remaining time.
+ 				 */
+ 				if ((log_checkpoints) && (num_written % check_interval == 0)) {
+ 
+ 					TimestampDifference(lastTimestamp,
+ 										GetCurrentTimestamp(),
+ 										&curr_secs, &curr_usecs);
+ 
+ 					/* if at least 5 seconds elapsed since the last log, log */
+ 					if ((curr_secs >= 5) || (num_since_update >= log_interval)) {
+ 
+ 						TimestampDifference(startTimestamp,
+ 											GetCurrentTimestamp(),
+ 											&total_secs, &total_usecs);
+ 
+ 						curr_time = curr_secs + (float) curr_usecs / 1000000;
+ 						total_time = total_secs + (float) total_usecs / 1000000;
+ 
+ 						elog(LOG, "checkpoint status: wrote %d buffers of %d (%.1f%%) in %.1f s; "
+ 							 "average %.1f MB/s (%d buffers, %ld.%03d s), "
+ 							 "current %.1f MB/s (%d buffers, %ld.%03d s), "
+ 							 "remaining %.1f s",
+ 							 num_written, num_to_write, ((float) num_written * 100 / num_to_write),
[remainder of the patch truncated in the archive]
Re: [HACKERS] strange row number estimates in pg9.1rc1

2011-09-01 Thread Bruce Momjian
Tom Lane wrote:
 I wrote:
  Sergey E. Koposov m...@sai.msu.ru writes:
  I'm seeing something weird which looks like a bug in 9.1rc1 after the
  upgrade 8.4 -> 9.0 -> 9.1 done using pg_upgrade.
 
  Hm, I wonder what pg_upgrade left relpages/reltuples set to ...
 
 Sure enough, that's the problem.  pg_upgrade leaves relpages/reltuples
 set to zero, but it also imports the visibility map pages from the old
 cluster.  If the old visibility map shows the table as all-visible,
 then this happens when you try to VACUUM ANALYZE the table:
 
 1. VACUUM doesn't process any pages, so it has no tuple density
 estimate.  It leaves reltuples set to zero, but it does set relpages.
 
 2. ANALYZE scans some part of the table.  It gets a tuple density
 estimate for those pages ... but if that's only a small fraction of
 the table, it believes the zero estimate of tuple density elsewhere.
 So you get only a small update of reltuples.
 
 (The above behavior is new as of commit
 b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8, BTW.)
 
 Basically, step 1 is buggy here: if we aren't making an update to
 reltuples, we shouldn't set relpages either.  Setting it nonzero
 changes the implied tuple density from "unknown" to "known zero",
 which is wrong.
 
 I'll go fix that, but I think it might be a good idea for pg_upgrade
 to think about preserving the relpages/reltuples columns ...

pg_upgrade currently only restores some oids and frozenxids.  We would
need to modify pg_dump --binary-upgrade mode to restore those values ---
it isn't hard to do.
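
Presumably something shaped like the UPDATEs pg_dump already emits in
binary-upgrade mode for relfrozenxid -- a sketch only, not actual pg_dump
output:

    -- hypothetical statement emitted per table by pg_dump --binary-upgrade
    UPDATE pg_catalog.pg_class
    SET relpages = 123456, reltuples = 7890123
    WHERE oid = 'public.some_table'::pg_catalog.regclass;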

 PS: right now, you cannot reproduce this in a 9.0 -> HEAD upgrade,
 because of this patch:
 
 commit 00a7c9014a8fbb7388a807daeba3e0a85b49a747
 Author: Bruce Momjian br...@momjian.us
 Date:   Fri Aug 19 11:20:30 2011 -0400
 
 In pg_upgrade, don't copy visibility map files from clusters that did not
 have crash-safe visibility maps to clusters that expect crash-safety.
 
 Request from Robert Haas.
 
 I did reproduce it in a 9.0 -> 9.1 test.

Right, that is expected.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-09-01 Thread Greg Sabino Mullane



 It's off topic. But I think custom format would require major mangling
 to be able to handle a complete cluster. This isn't just a simple matter
 of programming, IMNSHO.

Oh, I meant just having it create separate custom-format files for each
database, as shell scripts all over the world have been doing for years.
It would be nice if that were simply built in.

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201109012139
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





[HACKERS] Getting the OID inside the executor

2011-09-01 Thread Fazool
Hello everyone,

I am implementing some functionality in PostgreSQL where I want to
track which rows were accessed by a user query. I am implementing the
functionality inside Postgres so that no changes are required
on the client side (e.g. re-writing queries).
Rows are identified by OIDs, and I have set default_with_oids=true, so
that all tables are created with OIDs.

Now, when I run a 'select * from my_table', I can easily get the OID
inside the executor, because I get a 'HeapTuple' from ExecScan(). I call
HeapTupleGetOid() on the heap tuple and I have the OID.

The problem is that when I run a query with a projection, e.g.
'select name from my_table', the scanner returns a null
HeapTuple. How can I get the OID in such a case?

What would be the best way of doing this? Should I modify the planner
to add 'get oid' as a junk filter? Would it affect something else?

Any comments will be highly appreciated.

Thanks!



Re: [HACKERS] Getting the OID inside the executor

2011-09-01 Thread Tom Lane
Fazool fazoolm...@gmail.com writes:
 I am implementing some functionality in PostgreSQL where I want to
 track which rows were accessed by a user query. I am implementing the
 functionality inside Postgres so that no changes are required
 on the client side (e.g. re-writing queries).
 Rows are identified by OIDs, and I have set default_with_oids=true, so
 that all tables are created with OIDs.

It's a serious, serious error to suppose that OIDs are adequate
identifiers for rows in user tables.

We use OIDs to identify rows in system catalogs.  We can get away with
that, more or less, because (a) system catalogs are not likely to
contain billions of rows, and (b) we place a unique index on OID on
every system catalog that has OIDs.  Neither of these statements is
very tenable for user tables.

You haven't said anything about what it is you actually need to
accomplish here, but can you use TIDs as row identifiers?  What's
the required lifetime of the identifiers?

regards, tom lane



Re: [HACKERS] Getting the OID inside the executor

2011-09-01 Thread Fazool
Thanks for the reply Tom.

As far as I know, the TID of a row can change over time (e.g. when the
row is updated). Isn't that so? If so, it won't solve my problem.
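
For example, a quick demonstration in psql (a sketch; the exact ctids
depend on page layout):

  CREATE TABLE t (a int);
  INSERT INTO t VALUES (1);
  SELECT ctid FROM t;   -- (0,1)
  UPDATE t SET a = 2;
  SELECT ctid FROM t;   -- now (0,2): the row's TID changed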

It says here (http://www.postgresql.org/docs/9.0/static/ddl-system-columns.html)
that "The OID, or even better a user-defined serial number, should be used
to identify logical rows," so I thought I could use OIDs. You are right
that OIDs are 32-bit ints, so they will wrap around at some point.
The reason why I don't want to use a SERIAL user column is that I want
the process to be transparent to the user.

What I want to accomplish is the following. I want to trace all
accesses (read/update/insert) to rows. If a row was accessed twice, it
should appear twice in the trace. Later (offline), I want to analyze
the trace to find some patterns, and after deducing some results from
the patterns, I should be able to access the original rows in the
database. For example, if my pattern analysis says that row x is
"best", I should be able to read x from the DB. Hence, a tuple should
be uniquely identifiable (have the same ID) in the DB and in the trace.

What would you suggest for such a scenario?

Thanks for the help.


On Thu, Sep 1, 2011 at 10:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fazool fazoolm...@gmail.com writes:
 I am implementing some functionality in PostgreSQL where I want to
 track which rows were accessed by a user query. I am implementing the
 functionality inside Postgres so that no changes are required
 on the client side (e.g. re-writing queries).
 Rows are identified by OIDs, and I have set default_with_oids=true, so
 that all tables are created with OIDs.

 It's a serious, serious error to suppose that OIDs are adequate
 identifiers for rows in user tables.

 We use OIDs to identify rows in system catalogs.  We can get away with
 that, more or less, because (a) system catalogs are not likely to
 contain billions of rows, and (b) we place a unique index on OID on
 every system catalog that has OIDs.  Neither of these statements is
 very tenable for user tables.

 You haven't said anything about what it is you actually need to
 accomplish here, but can you use TIDs as row identifiers?  What's
 the required lifetime of the identifiers?

                        regards, tom lane

