Re: [HACKERS] machine-readable explain output v4

2009-08-13 Thread Greg Smith

On Tue, 11 Aug 2009, Mike wrote:


Have any tool authors stepped up and committed resources to utilizing
this feature in the near term?


Even before the easier to read format was available, there were already 
multiple EXPLAIN analysis tools floating around, some of them web-based 
like you're considering; a list is at 
http://wiki.postgresql.org/wiki/Using_EXPLAIN


You might expect some of those tool authors would do the appropriate 
overhaul to import the new format data, and perhaps make things more 
powerful or simpler in the process.  You might want to collaborate with 
someone writing one of those existing applications rather than start over 
on your own.


The reason I would like to provide this tool in a web-based form is that 
no additional software installation would be necessary for the user, 
reducing any hurdles to using it to zero.


I personally hate only having a web-based tool for this style of 
application, because I'm always dealing with data I can't paste into 
somebody else's site for EXPLAIN output--that's a common hurdle that's 
impossible to clear given all the regulatory and business secret 
restrictions people work under nowadays.  Even when the source code is 
available for the web app, that puts you back to needing to install the 
tool locally, and I've found web apps tend to be more complicated to get 
running than a typical standalone app.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Greg Smith

On Tue, 11 Aug 2009, Dimitri Fontaine wrote:

We should somehow provide a default archive and restore command integrated 
into the main product, so that it's as easy as turning it 'on' in the 
configuration for users to have something trustworthy: PostgreSQL will keep 
past logs in a pg_xlog/archives subdir or some other default place, and 
will know about the setup at startup time when/if needed.


Wandering a little off topic here because this plan reminded me of 
something else I've been meaning to improve...while most use-cases require 
some sort of network transport for this to be useful, there is one obvious 
situation where it would be great to have a ready-to-roll setup by 
default.  Right now, if people want to make a filesystem-level backup 
of their database, they first have to grapple with setting up the archive 
command to do so.  If the system were shipped in a way that made that 
trivial to activate, perhaps using something like what you describe here, 
that would reduce the complaints that PostgreSQL doesn't have any easy way 
to grab a filesystem hot copy of the database. Those rightly pop up 
sometimes, and it would be great if the procedure were reduced to:


1) Enable archiving
2) pg_start_backup
3) rsync/tar/cpio/copy/etc.
4) pg_stop_backup
5) Disable archiving

because the default archive_command would be something that supported a 
filesystem snapshot using a standard layout.
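
For illustration, a minimal sketch of the database-side half of that
procedure as seen from psql, assuming archiving has already been switched on
with some default archive_command along the lines Dimitri describes; the
actual file copy still happens outside the server:

SELECT pg_start_backup('nightly base backup');
-- ... step 3: copy the data directory at the filesystem level (rsync/tar/cpio/...) ...
SELECT pg_stop_backup();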


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



[HACKERS] Geometry RESTRICT and JOIN

2009-08-13 Thread Paul Matthews
I'm trying to add all the box op point operators. The C routines are
written and working as advertised. The manual's description of the
RESTRICT and JOIN clauses of CREATE OPERATOR doesn't seem too clear. Are
these samples correct, or am I totally off base here?

CREATE OPERATOR  (
  LEFTARG= box,
  RIGHTARG   = point,
  PROCEDURE  = leftof,
  RESTRICT   = scalarltsel, -- ?? UNSURE
  JOIN   = positionjoinsel  -- ?? UNCLEAR
);

CREATE OPERATOR  (
  LEFTARG= box,
  RIGHTARG   = point,
  PROCEDURE  = notleft,
  RESTRICT   = scalargtsel, -- ?? UNSURE
  JOIN   = positionjoinsel  -- ?? UNCLEAR
);

CREATE OPERATOR @ (
  LEFTARG= box,
  RIGHTARG   = point,
  PROCEDURE  = contains,
  RESTRICT   = eqsel,   -- ?? UNSURE
  JOIN   = contjoinsel  -- ?? UNCLEAR
);

...etc...




Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Dimitri Fontaine

Hi,

Josh Berkus j...@agliodbs.com writes:
 Will do.  Teaching myself RST now 

I've done a lot of RST editing before and found it pretty
straightforward, except for default table handling, where ASCII-art
maintenance is a pain or you have to use extended tools like Emacs
table mode and such. Or use list-table and rejoice :)

  http://docutils.sourceforge.net/docs/ref/rst/directives.html#list-table

Regards,
-- 
dim



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Massa, Harald Armin
Within the source code / build options section there is:

- Reserve the shared memory region during backend startup on Windows,
  so that memory allocated by starting third party DLLs doesn't end up
  conflicting with it.  Hopefully this solves the long-time issue with
  "could not reattach to shared memory" errors on Win32.


I suggest that it should also be pointed out that this fix will be
backported to 8.3 and 8.4 (as far as I followed the ML), similar to the
information at

- Fast shutdown stop should forcibly disconnect any active backends,
  even if a smart shutdown is already in progress. Backpatched to 8.3.


best wishes

Harald


-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
LASIK good, steroids bad?


Re: [HACKERS] COPY speedup

2009-08-13 Thread Pierre Frédéric Caillaud



But when I see a big red button, I just press it to see what happens.
Ugly hacks are useful for finding out how fast the thing can go; then the
interesting part is to reimplement it cleanly, trying to reach the
same performance...


Right -- now that you've shown a 6x speedup, it is clear that
it makes sense to attempt a reimplementation.  It also means it makes
sense to have an additional pair or two of input/output functions.


Okay.

Here are some numbers. The tables are the same as in the previous email,
and the results also include those of copy patch 4 (aka the API hack)
for reference.

I benchmarked these:

* p5 = no API changes, COPY TO optimized:
- Optimizations in COPY (fast buffer, many fewer fwrite() calls, etc.)
remain.
- SendFunction API reverted to its original state (actually, the API changes
are still there, but deactivated, fcinfo->context = NULL).

=> small performance gain; of course the lower per-row overhead is more
visible on test_one_int, because that table has 1 column.
=> the (still huge) distance between p5 and the API hack is split between
overhead in pq_send* + StringInfo (which we will tackle below) and palloc()
overhead (which the API hack removed by passing the destination
buffer directly).

* p6 = p5 + optimization of pq_send*
- inlining strategic functions
- probably benefits many other code paths

=> small incremental performance gain

* p7 = p6 + optimization of StringInfo
- inlining strategic functions
- probably benefits many other code paths

=> small incremental performance gain (they start to add up nicely)

* p8 = p7 + optimization of palloc()
- actually this is extremely dumb:
- int4send and int2send simply palloc() 16 bytes instead of 1024
- the initial size of the allocset is 64K instead of 8K

=> still, it has interesting results...

The three patches above are quite simple (especially the inlines) and yet
the speedup is already nice.

* p9 = p8 + monstrously ugly hack
COPY looks at the sendfunc, notices it's int{2,4}send, and replaces it
with int{2,4}fastsend, which is called directly from C, bypassing the fmgr
(urrrgghh); of course it only works for ints.
This gives information about fmgr overhead: fmgr is pretty damn fast.

* p10 no copy
does everything except calling the SendFuncs; it writes dummy data instead.
This gives the time used in everything except the SendFuncs: table scan,
deform_tuple, file writes, etc., which is an interesting thing to know.

RESULTS:

COPY annonces TO '/dev/null' BINARY :
  Time | Speedup |  Table |  KRows | MTuples | Name
   (s) |         |   MB/s |     /s |      /s |
-------|---------|--------|--------|---------|----------------------------------------
 2.149 |  2.60 x | 151.57 | 192.40 |    7.50 | copy to patch 4
 3.055 |  1.83 x | 106.64 | 135.37 |    5.28 | p8 = p7 + optimization of palloc()
 3.202 |  1.74 x | 101.74 | 129.15 |    5.04 | p7 = p6 + optimization of StringInfo
 3.754 |  1.49 x |  86.78 | 110.15 |    4.30 | p6 = p5 + optimization of pq_send*
 4.434 |  1.26 x |  73.47 |  93.26 |    3.64 | p5 no API changes, COPY TO optimized

 5.579 |   ---   |  58.39 |  74.12 |    2.89 | compiled from source

COPY archive_data TO '/dev/null' BINARY :
   Time | Speedup | Table |  KRows | MTuples | Name
    (s) |         |  MB/s |     /s |      /s |
--------|---------|-------|--------|---------|----------------------------------------
  5.372 |  3.75 x | 73.96 | 492.88 |   13.80 | copy to patch 4
  8.545 |  2.36 x | 46.49 | 309.83 |    8.68 | p8 = p7 + optimization of palloc()
 10.229 |  1.97 x | 38.84 | 258.82 |    7.25 | p7 = p6 + optimization of StringInfo
 12.869 |  1.57 x | 30.87 | 205.73 |    5.76 | p6 = p5 + optimization of pq_send*
 15.559 |  1.30 x | 25.54 | 170.16 |    4.76 | p5 no API changes, COPY TO optimized

 20.165 |   ---   | 19.70 | 131.29 |    3.68 | 8.4.0 / compiled from source

COPY test_one_int TO '/dev/null' BINARY :
  Time | Speedup |  Table |   KRows | MTuples | Name
   (s) |         |   MB/s |      /s |      /s |
-------|---------|--------|---------|---------|----------------------------------------
 1.493 |  4.23 x | 205.25 | 6699.22 |    6.70 | p10 no copy
 1.660 |  3.80 x | 184.51 | 6022.33 |    6.02 | p9 monstrously ugly hack
 2.003 |  3.15 x | 152.94 | 4991.87 |    4.99 | copy to patch 4
 2.803 |  2.25 x | 109.32 | 3568.03 |    3.57 | p8 = p7 + optimization of palloc()
 2.976 |  2.12 x | 102.94 | 3360.05 |    3.36 | p7 = p6 + optimization of StringInfo
 3.165 |  2.00 x |  96.82 | 3160.05 |    3.16 | p6 = p5 + optimization of pq_send*
 3.698 |  1.71 x |  82.86 | 2704.43 |    2.70 | p5 no API changes, COPY TO optimized

 6.318 |   ---   |  48.49 | 1582.85 |    1.58 | 8.4.0 / compiled from source

COPY test_many_ints TO '/dev/null' BINARY :
  Time | Speedup |  Table |  KRows | MTuples | Name
   (s) |         |   MB/s |     /s |      /s |
-------|---------|--------|--------|---------|----------------------------------------
 1.007 |  8.80 x | 127.23 | 993.34 |

[HACKERS] FDW-based dblink

2009-08-13 Thread Itagaki Takahiro
Here is a proposal to integrate contrib/dblink and SQL/MED (foreign
data wrapper).

Dblink manages connections and transactions by itself at the moment,
but there are some benefits to splitting the database connectors out into
FDWs. Dblink would then use those multiple connectors. For example, we would
be able to retrieve data from Oracle into PostgreSQL directly if we had an
Oracle connector.

 New syntax in SQL
---
CREATE FOREIGN DATA WRAPPER postgres
CONNECTOR pg_catalog.dblink_postgres;
or
CREATE FOREIGN DATA WRAPPER postgres
OPTIONS (connector 'pg_catalog.dblink_postgres')

We don't have to modify gram.y if we take the latter syntax, but we need
to modify the VALIDATORs to distinguish 'connector' from the other options.
The 'connector' option should not be passed as part of the connection string.

 New interface in C

pg_catalog.dblink_postgres is a function that has the following prototype:

Connection *connector(List *defElems);

The argument 'defElems' is the concatenation of the connection options
from the FDW, server, and user mapping.
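
For illustration, a sketch of where those options would come from, reusing
the OPTIONS-based syntax proposed above (all object names are made up); the
connector would receive the three option lists concatenated as defElems:

CREATE FOREIGN DATA WRAPPER postgres
    OPTIONS (connector 'pg_catalog.dblink_postgres');

CREATE SERVER sales_db FOREIGN DATA WRAPPER postgres
    OPTIONS (host 'sales.example.com', dbname 'sales');

CREATE USER MAPPING FOR CURRENT_USER SERVER sales_db
    OPTIONS (user 'report', password 'secret');

-- if dblink resolves the server name to that merged option list,
-- usage could then look like:
SELECT * FROM dblink('sales_db', 'SELECT count(*) FROM orders') AS t(n bigint);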

Also, two new interfaces will be introduced:

interface Connection/* represents PGconn */
{
voiddisconnect(self);
Cursor *open(self, query, fetchsize); /* for SELECT */
int64   exec(self, query);/* for UPDATE, INSERT, DELETE */
booltransaction_command(self, type);
}

interface Cursor/* represents PGresult and server-side cursor */
{
bool fetch(self, OUT values);
void close(self);
}

They have some methods implemented with function pointers. The benefit
of using function pointers is that we only have to export one connector
function to pg_proc.

The Cursor interface represents both a result set and a server-side cursor.
PostgreSQL has SQL-level cursors, but there are some databases that have
protocol-level cursors. This abstraction layer is needed for dblink to
handle connectors to other databases.

 Other features

The present dblink is a thin wrapper around libpq, but some of my customers
want automatic transaction management: remote transactions would be
committed with 2PC when the local transaction is committed.
To achieve that, I think an on-commit trigger is needed,
but it is hard to implement with the current infrastructure.
(That is one of the reasons I proposed merging dblink into core.)

 Other considerations
--
The proposed method is a SQL-based connector. There might be another
approach -- a ScanKey-based connector. It would be similar to the index access
method interface (pg_am), taking a relation id and scankeys instead of
SQL text. The ScanKey-based approach would work better if we try to push a
WHERE clause down to an external database. However, I think we need the
SQL-based interface in any case; the ScanKeys would be converted to SQL and
passed to the external database.
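
As a small illustration of that last point (relation and qual made up), a
ScanKey such as (refid > 100) on a remote relation would simply be shipped
as the equivalent SQL:

SELECT * FROM remote_orders WHERE refid > 100;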


I have a prototype of the feature. I'd like to submit it for 8.5.
Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




[HACKERS] trigger functions can only be called as triggers

2009-08-13 Thread Peter Eisentraut
Is there a reason why the function manager allows calling trigger functions 
outside of triggers and forces the PLs to catch this case themselves?  Is 
there a case where calling trigger functions directly is useful?



Re: [HACKERS] FDW-based dblink

2009-08-13 Thread Heikki Linnakangas
Itagaki Takahiro wrote:
 Present dblink is a thin wrapper of libpq, but some of my customers
 want automatic transaction managements. Remote transactions are
 committed with 2PC when the local transaction is committed.
 To achieve it, I think we need on-commit trigger is needed,
 but it is hard to implement with current infrastructure.
 (That is one of the reason I proposed to merge dblink into core.)

Quite aside from the requirement for an on-commit trigger, how exactly
would you use 2PC with the remote database? When would you issue PREPARE
TRANSACTION, and when COMMIT PREPARED? What if the local database
crashes in between - is the remote transaction left hanging in the prepared
state?
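
To make the question concrete, a rough sketch of the ordering being discussed
(connection handling omitted, transaction name made up); PREPARE TRANSACTION
and COMMIT PREPARED run on the remote side via dblink, around the local commit:

-- remote, just before committing locally:
PREPARE TRANSACTION 'dblink_xact_1';
-- local:
COMMIT;
-- remote, once the local commit has succeeded:
COMMIT PREPARED 'dblink_xact_1';
-- a crash of the local database between the last two steps leaves
-- 'dblink_xact_1' sitting prepared on the remote side until something
-- resolves it (COMMIT PREPARED or ROLLBACK PREPARED).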

Making the remote transaction atomic with the local one is a lot harder
than it may seem at first glance. It's doable, but I think you'll need
to build a full-fledged transaction manager into dblink, or integrate
with a 3rd party one.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] DECLARE doesn't set/reset sqlca after DECLARE cursor

2009-08-13 Thread Michael Meskes
On Wed, Aug 12, 2009 at 07:13:44PM +0200, Boszormenyi Zoltan wrote:
 a customer of ours complained about a behavioural difference
 ...
 The attached patch implements this. The only downside
 is that now DECLARE CURSOR cannot appear outside
 of a function, as a change in test/preproc/variable.pgc reflects

DECLARE by definition is a declarative command and as such should be able to
live outside a function.

Michael

-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo/Skype: michaelmeskes, Jabber: mes...@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] surprising trigger/foreign key interaction

2009-08-13 Thread Stefan Kaltenbrunner

Tom Lane wrote:

Alvaro Herrera alvhe...@commandprompt.com writes:

However I'm guessing that what actually happens is that heap_update is
returning HeapTupleSelfUpdated instead, which the code states as
/* nothing to do */.


Yeah.


I imagine this is so because of some old fiddling to get semantics just
right for obscure corner cases, but it feels wrong nevertheless.


I suspect it was reluctance to use the EvalPlanQual semantics (which
are pretty bogus in their own way) for perfectly deterministic
single-transaction cases.


Still, the current behaviour feels quite wrong, because even after the 
update the modified tuple still satisfies the WHERE clause of the DELETE, 
yet it won't actually get deleted.



Stefan



Re: [HACKERS] surprising trigger/foreign key interaction

2009-08-13 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:

  I imagine this is so because of some old fiddling to get semantics just
  right for obscure corner cases, but it feels wrong nevertheless.
 
 I suspect it was reluctance to use the EvalPlanQual semantics (which
 are pretty bogus in their own way) for perfectly deterministic
 single-transaction cases.

I suspect the FK trigger messing up the visibility is an obscure corner
case too :-(

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-08-13 Thread Robert Haas
On Tue, Aug 11, 2009 at 4:31 AM, Peter Eisentrautpete...@gmx.net wrote:
 On Tuesday 11 August 2009 08:28:24 Jaime Casanova wrote:
 try to build the docs to see how to properly test this and seems like
 you have to teach contrib.sgml and bookindex.sgml about
 dict-unaccent... and when i did that i got this:

 
 openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -c
 /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d
 stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
 openjade:dict-unaccent.sgml:48:1:E: non SGML character number 128
 openjade:dict-unaccent.sgml:49:1:E: non SGML character number 129
 openjade:dict-unaccent.sgml:50:1:E: non SGML character number 130
 openjade:dict-unaccent.sgml:51:1:E: non SGML character number 131
 openjade:dict-unaccent.sgml:52:1:E: non SGML character number 132
 openjade:dict-unaccent.sgml:53:1:E: non SGML character number 133
 openjade:dict-unaccent.sgml:54:1:E: non SGML character number 134
 openjade:dict-unaccent.sgml:116:4:E: element B undefined
 make: *** [HTML.index] Error 1
 make: *** Se borra el archivo `HTML.index'
 

You should escape the special characters, as well as the <b> that appears as
part of the example output, using character entities (&amp; etc.).

Sounds like this patch needs a little bit of doc adjustment per the
above and is then ready for committer?

...Robert



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Alvaro Herrera
Dimitri Fontaine wrote:
 
 Hi,
 
 Josh Berkus j...@agliodbs.com writes:
  Will do.  Teaching myself RST now 
 
 I've been doing a lot of RST editing before, and found it pretty
 straightforward. Except for default table handling, where ascii-art
 maintenance is a pain, or you have to use extended tools, like emacs
 table mode and such. Or use list-table and rejoy :)
 
   http://docutils.sourceforge.net/docs/ref/rst/directives.html#list-table

Yeah, table handling in RST is pretty silly, particularly when you have
to escape some character in a cell.

I wonder if this format can be converted to SGML DocBook automatically.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] dependencies for generated header files

2009-08-13 Thread Robert Haas
On Tue, Aug 11, 2009 at 9:56 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Given that the anum.h stuff is gone, vastly might be an
 overstatement.  I'm pretty surprised to find out that people don't
 like the idea of having dependencies be correct from anywhere in the
 tree.  Even if I'm the only developer who does partial builds, the
 cost seems to me to be next to nil, so I'm not quite sure what anyone
 gets out of rejecting this patch.

 It's not that having the dependencies be 100% up to date wouldn't be
 nice; it's that there's a limit to how much we're willing to uglify
 the Makefiles to have that.  The makefiles need maintenance too,
 you know, and putting things far away from where they should be is
 not any better in the makefiles than it is in C code.

Well, I certainly agree that making a huge mess to address what is
admittedly a corner case is not a good idea.  But I also don't think
this patch is all that messy.  However, I guess we're getting to the
point where we need to make a decision one way or the other so that we
can close out this CommitFest.

 As far as I can tell, if you've used --enable-depend then things will
 get updated properly before you can ever attempt to run the code
 (ie, install a rebuilt postmaster).  The only situation where you'd
 actually get an improvement from redoing the dependencies like this
 is where lack of an update to a derived file results in a compiler
 error/warning.  But there aren't many such cases.  The only one I can
 even think of offhand is lack of an fmgroids.h symbol for a newly-added
 function ... but we don't use F_XXX symbols enough to make that a
 convincing example.  We've intentionally arranged things so that
 more-fragile cases like gram.h are not referenced outside their own
 directories.

Yes, that's definitely the best situation.

...Robert



Re: [HACKERS] FDW-based dblink

2009-08-13 Thread Alvaro Herrera
Itagaki Takahiro wrote:

 Also new two interfaces will be introduced:
 
 interface Connection/* represents PGconn */
 {
 voiddisconnect(self);
 Cursor *open(self, query, fetchsize); /* for SELECT */
 int64   exec(self, query);/* for UPDATE, INSERT, DELETE */
 booltransaction_command(self, type);
 }

It's not good to return int64 in exec(), because it could have a
RETURNING clause.  (So it also needs a fetchsize).
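
For instance, a statement run through exec() could itself produce rows; the
table and column names here are made up purely to illustrate RETURNING:

UPDATE accounts
   SET balance = balance - 100
 WHERE id = 42
RETURNING id, balance;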

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Peter Eisentraut
On Thursday 13 August 2009 17:07:38 Alvaro Herrera wrote:
 I wonder if this format can be converted to SGML DocBook automatically.

Yes, that's why I used it.



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Tom Lane
Massa, Harald Armin c...@ghum.de writes:
 within source code, build options there is:
 - Reserve the shared memory region during backend startup on Windows,
   so that memory allocated by starting third party DLLs doesn't end up
   conflicting with it.  Hopefully this solves the long-time issue with
   "could not reattach to shared memory" errors on Win32.

 I suggest that it should also be pointed out that this fix will be
 backported to 8.3 and 8.4 (as much as I followed the ML);

Normally, bug fixes that have been back-patched wouldn't be mentioned at
all in a new major release's release notes.  The implied base that we
are comparing to in major-release notes is the end of the prior branch's
updates.  I'm not sure if this case should be an exception, or if we
should have a different general rule for alpha releases.  We'd like to
get more testing on that fix, so I think it is reasonable to mention it
for alpha1 --- but is that an exception specific to this bug fix, or
does it indicate we want to handle bug fixes differently in general
within alpha release notes?

In any case, it is not the function of the alpha release notes to
discuss changes in earlier release branches.  The reason the commit
log points out the back-patch is to make it easier to extract the
information when we prepare release notes for the back-branch updates.

regards, tom lane



Re: [HACKERS] trigger functions can only be called as triggers

2009-08-13 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Is there a reason why the function manager allows calling trigger functions 
 outside of triggers and forces the PLs to catch this case themselves?  Is 
 there a case where calling trigger functions directly is useful?

I think it's a matter of not wanting to slow down *all* function calls
with an error check that's useless for most.
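
For illustration, a sketch of the case in question (function name made up);
with the check left to the PLs, it is PL/pgSQL itself that rejects a direct call:

CREATE FUNCTION log_change() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  RETURN NEW;
END;
$$;

SELECT log_change();
-- ERROR:  trigger functions can only be called as triggers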

regards, tom lane



Re: [HACKERS] schemapg.h

2009-08-13 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Indeed, and it fails to get rid of all the dull declarations :-(.

 Right.  I don't think we're going to move forward if we only accept
 giant steps at a time, and we simultaneously reject patches that are too
 intrusive.

I'm okay with small steps as long as they're small steps in the right
direction ;-).  I'm not convinced that this script is the right
direction.

 I thought the idea was to generate all this stuff directly from the C
 struct declarations (plus some hardwired knowledge about the
 datatypes, comparable to what is in TypInfo in bootstrap.c already).

 Hmm, perhaps that's workable.  I'll have a look around.

OK.  It might be interesting to see if this can be unified somehow with
what the bootstrap.c code does.  (However, since that runs at initdb
time not during compilation, there may not be any reasonable way to
unify the two.)

regards, tom lane



Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-08-13 Thread Oleg Bartunov

Peter,

How do I write accented characters in SGML? Is it not allowed to write them 
as is?


Oleg
On Tue, 11 Aug 2009, Peter Eisentraut wrote:


On Tuesday 11 August 2009 08:28:24 Jaime Casanova wrote:

try to build the docs to see how to properly test this and seems like
you have to teach contrib.sgml and bookindex.sgml about
dict-unaccent... and when i did that i got this:


openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -c
/usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d
stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
openjade:dict-unaccent.sgml:48:1:E: non SGML character number 128
openjade:dict-unaccent.sgml:49:1:E: non SGML character number 129
openjade:dict-unaccent.sgml:50:1:E: non SGML character number 130
openjade:dict-unaccent.sgml:51:1:E: non SGML character number 131
openjade:dict-unaccent.sgml:52:1:E: non SGML character number 132
openjade:dict-unaccent.sgml:53:1:E: non SGML character number 133
openjade:dict-unaccent.sgml:54:1:E: non SGML character number 134
openjade:dict-unaccent.sgml:116:4:E: element B undefined
make: *** [HTML.index] Error 1
make: *** Se borra el archivo `HTML.index'



You should escape the special characters, as well as the <b> that appears as
part of the example output, using character entities (&amp; etc.).



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Geometry RESTRICT and JOIN

2009-08-13 Thread Tom Lane
Paul Matthews p...@netspace.net.au writes:
 I'm trying to add all the box op point operators. The C routines are
 written and working as advertised. The manuals description of the
 RESTRICT and JOIN clauses of CREATE OPERATOR don't seem too clear. Are
 these samples correct, or am I totally off base here?

Well, I'm pretty sure you don't want the scalar selectivity functions
for any of these.  IIRC the geometric selectivity functions already
come in pairs, e.g. you should use contsel and contjoinsel for @.
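
Following that suggestion, the last of the samples in the original mail might
become something like this (a sketch, not a tested definition):

CREATE OPERATOR @ (
  LEFTARG    = box,
  RIGHTARG   = point,
  PROCEDURE  = contains,
  RESTRICT   = contsel,
  JOIN       = contjoinsel
);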

regards, tom lane



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Alvaro Herrera
Tom Lane wrote:

 In any case, it is not the function of the alpha release notes to
 discuss changes in earlier release branches.  The reason the commit
 log points out the back-patch is to make it easier to extract the
 information when we prepare release notes for the back-branch updates.

Hmm, isn't it enough to use cvs2cl --follow branch?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-08-13 Thread Alvaro Herrera
Oleg Bartunov wrote:
 Peter,
 
 How do I write accented characters in SGML? Is it not allowed to write
 them as is?

&aacute; for á, etc.  You can't use characters that aren't in Latin-1, I think.
Writing them literally is not allowed.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 In any case, it is not the function of the alpha release notes to
 discuss changes in earlier release branches.  The reason the commit
 log points out the back-patch is to make it easier to extract the
 information when we prepare release notes for the back-branch updates.

 Hmm, isn't it enough to use cvs2cl --follow branch?

Yeah, cvs will certainly tell you the same information, which is why
I frequently don't bother mentioning the point at all in commit
messages.  I think the most useful reason for mentioning the branch(es)
in a commit message is to explain why a particular patch goes back
so far and no farther.

regards, tom lane



Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 *scratches head*
 
 I don't really know how you COULD pick a safe default location.
 Presumably any location that's in the default postgresql.conf file
 would be under $PGDATA, which kind of defeats the purpose of the
 whole thing.  In other words, you're always going to have to move it
 anyway, so why bother with a default that is bound to be wrong?
 
Well, we want the WAL files to flow in two directions from the
database server so that if either target (or connectivity to it) is
down, the WAL files still flow to the other target.  The only sensible
way to do that, as far as we've determined, is to have the archive
script copy to a temporary directory and move to a publisher
directory, then have once-a-minute crontab jobs to rsync the directory
to the targets.
 
We figure that a WAL file is no more at risk in the publisher
directory than in the pg_xlog directory on the same volume.
 
The other reason is what I think Greg Smith was mentioning --
simplifying the process of grabbing a usable PITR backup for novice
users.  That seems like it has merit.
 
-Kevin



Re: [HACKERS] DECLARE doesn't set/reset sqlca after DECLARE cursor

2009-08-13 Thread Boszormenyi Zoltan
Michael Meskes wrote:
 On Wed, Aug 12, 2009 at 07:13:44PM +0200, Boszormenyi Zoltan wrote:
   
 a customer of ours complained about a behavioural difference
 ...
 The attached patch implements this. The only downside
 is that now DECLARE CURSOR cannot appear outside
 of a function, as a change in test/preproc/variable.pgc reflects
 

 DECLARE by definition is a declarative command and as such should be able to
 live outside a function.
   

Okay, so it's a declarative command. But if we're in a function,
we should still emit a call to ecpg_init, to be able to follow
the Informix behaviour. We can limit it to compat mode, though.
The attached patch does this, and detects being inside a function
by braces_open > 0. Short of rewriting ECPG into a full-fledged
C/C++ preprocessor, we can't do better currently.

In compat mode, you cannot do
DECLARE mycur CURSOR FOR SELECT ... INTO :var, ...
or
DECLARE mycur CURSOR FOR SELECT ... WHERE field = ?
in the global scope because adjust_informix() emits function calls
outside of a function. Or is this declaration illegal?
At least it should be documented in PostgreSQL.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil. (Matthew 5:37) - basics of digital technology.
May your kingdom come - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig  Schönig GmbH
http://www.postgresql.at/

diff -dcrpN pgsql.describe/src/interfaces/ecpg/ecpglib/exports.txt pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/exports.txt
*** pgsql.describe/src/interfaces/ecpg/ecpglib/exports.txt	2008-03-25 13:58:49.0 +0100
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/exports.txt	2009-08-12 18:13:11.0 +0200
*** ECPGstatus   23
*** 26,28 
--- 26,29 
  ECPGtrans24
  sqlprint 25
  ECPGget_PGconn			 26
+ ECPGreset_sqlca			 27
diff -dcrpN pgsql.describe/src/interfaces/ecpg/ecpglib/misc.c pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/misc.c
*** pgsql.describe/src/interfaces/ecpg/ecpglib/misc.c	2009-08-07 13:06:28.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/ecpglib/misc.c	2009-08-12 18:22:35.0 +0200
*** ecpg_gettext(const char *msgid)
*** 489,491 
--- 489,499 
  }
  
  #endif   /* ENABLE_NLS */
+ 
+ bool
+ ECPGreset_sqlca(int lineno, const char *connection_name)
+ {
+ 	struct connection *con = ecpg_get_connection(connection_name);
+ 
+ 	return ecpg_init(con, connection_name, lineno);
+ }
diff -dcrpN pgsql.describe/src/interfaces/ecpg/include/ecpglib.h pgsql.declare-reset-sqlca/src/interfaces/ecpg/include/ecpglib.h
*** pgsql.describe/src/interfaces/ecpg/include/ecpglib.h	2009-08-11 14:34:03.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/include/ecpglib.h	2009-08-12 18:21:06.0 +0200
*** bool		ECPGset_desc(int, const char *, in
*** 84,89 
--- 84,90 
  void		ECPGset_noind_null(enum ECPGttype, void *);
  bool		ECPGis_noind_null(enum ECPGttype, void *);
  bool		ECPGdescribe(int, bool, const char *, const char *, ...);
+ bool		ECPGreset_sqlca(int, const char *);
  
  /* dynamic result allocation */
  void		ECPGfree_auto_mem(void);
diff -dcrpN pgsql.describe/src/interfaces/ecpg/preproc/ecpg.addons pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.addons
*** pgsql.describe/src/interfaces/ecpg/preproc/ecpg.addons	2009-08-11 14:34:03.0 +0200
--- pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.addons	2009-08-13 17:40:41.0 +0200
*** ECPG: DeclareCursorStmtDECLAREcursor_nam
*** 318,324 
  		cur = this;
  
  		if (INFORMIX_MODE)
! 			$$ = cat_str(5, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), make_str("/*"), mm_strdup(this->command), make_str("*/"));
  		else
  			$$ = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
  	}
--- 318,338 
  		cur = this;
  
  		if (INFORMIX_MODE)
! 		{
! 			char *comment;
! 			const char *con = connection ? connection : "NULL";
! 
! 			comment = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
! 
! 			if (braces_open > 0) /* we're in a function */
! 			{
! 				char *command = (char *)mm_alloc(sizeof("ECPGreset_sqlca(__LINE__, );") + strlen(con));
! 				sprintf(command, "ECPGreset_sqlca(__LINE__, %s);", con);
! 				$$ = cat_str(4, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), command, comment);
! 			}
! 			else
! 				$$ = cat_str(3, adjust_informix(this->argsinsert), adjust_informix(this->argsresult), comment);
! 		}
  		else
  			$$ = cat_str(3, make_str("/*"), mm_strdup(this->command), make_str("*/"));
  	}
diff -dcrpN pgsql.describe/src/interfaces/ecpg/preproc/ecpg.trailer pgsql.declare-reset-sqlca/src/interfaces/ecpg/preproc/ecpg.trailer
*** 

Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Hitoshi Harada
2009/8/8 Alvaro Herrera alvhe...@commandprompt.com:
 Олег Царев wrote:
 Hello all!
 If no one objects (in other words, all agree), I'll continue work on the patch -
 in particular, I want to support a second strategy (a tuple store instead of
 a hash table) to preserve the order of the source (a cheaper solution in the
 case of grouping sets + ORDER BY), investigate and brainstorm other
 optimisations, and write regression tests and technical documentation.
 But I need some time to complete my investigation of the internals of
 PostgreSQL, particularly CTE.

 Where are we on this patch?  Is it moving forward?


It seems to me that the patch goes backward.

I looked through the gsets-0.6.diff for about an hour, and found that it is
now only syntactic sugar that builds multiple GROUP BY queries based on
the CTE functionality. There's no executor modification.

If I remember correctly, the original patch touched executor parts.
I'd buy it if GROUPING SETS touched the executor, but I don't if it is
only syntactic sugar, because you can write the same thing yourself
without the GROUPING SETS syntax. The motivation for pushing this forward is
performance that cannot be achieved by rewriting the query, I guess.

Because the GROUP BY we have today is by definition a subset of GROUPING
SETS, I suppose we'll refactor nodeAgg.c so that it can take multiple
group definitions. And we must support both HashAgg and GroupAgg. For
HashAgg it is easier in any case, as the earlier patch shows. For GroupAgg
it is a bit complicated, since we sort by different key sets.

When we want GROUPING SETS (a, b), we first sort by a and aggregate,
then sort by b and aggregate. This is the same as:

select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b

so it is nothing better than query rewriting unless we invent something new.

But in the case of subtotals and a grand total, like a ROLLUP query, GroupAgg
can do it in a single scan by keeping PerGroup states with multiple life
cycles.
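
To spell that out with the same x, a and b, ROLLUP (a, b) is by definition
GROUPING SETS ((a, b), (a), ()), i.e. roughly the union below, and a single
scan sorted on (a, b) can produce all three levels in one pass:

select a, b, count(*) from x group by a, b
union all
select a, null, count(*) from x group by a
union all
select null, null, count(*) from x;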

Anyway, before going ahead we need to find a rough sketch of how to
implement this feature. Is only syntactic sugar acceptable, or is internal
executor support necessary?


Regards,


-- 
Hitoshi Harada



Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Олег Царев
2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/8 Alvaro Herrera alvhe...@commandprompt.com:
 Олег Царев escribió:
 Hello all!
 If no one objecte (all agree, in other say) i continue work on patch -
 particulary, i want support second strategy (tuple store instead of
 hash-table) for save order of source (more cheap solution in case with
 grouping sets + order by), investigate and brainstorm another
 optimisation, writing regression tests and technical documentation.
 But I need some time for complete my investigation internals of
 PostgreSQL, particulary CTE.

 Where are we on this patch?  Is it moving forward?


 It seems to me that the patch goes backward.

 I looked trough the gsets-0.6.diff for about an hour, and found it is
 now only a syntax sugar that builds multiple GROUP BY queries based on
 CTE functionality. There's no executor modification.

 If I remember correctly, the original patch touched executor parts.
 I'd buy if the GROUPING SETS touches executor but I don't if this is
 only syntax sugar, because you can write it as the same by yourself
 without GROUPING SETS syntax. The motivation we push this forward is
 performance that cannot be made by rewriting query, I guess.

 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.

 When we want GROUPING SET(a, b), at first we sort by a and aggregate
 then sort by b and aggregate. This is the same as:

 select a, null, count(*) from x group by a
 union all
 select null, b, count(*) from x group by b

 so nothing better than query rewriting unless we invent something new.

 But in case of sub total and grand total like ROLLUP query, GroupAgg
 can do it by one-time scan by having multiple life cycle PerGroup
 state.

 Anyway, before going ahead we need to find rough sketch of how to
 implement this feature. Only syntax sugar is acceptable? Or internal
 executor support is necessary?


 Regards,


 --
 Hitoshi Harada


All correct, except for this part:
 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.
because GROUP BY is an optimized version of GROUPING SETS.
Of course, we can extend the current definition of GROUP BY, but we
would regress its performance.
Some questions for you:

How do we calculate the aggregations for ROLLUP in a single pass?
The stupid way - keep a separate aggregation buffer for every group,
and accumulate every record into the buffer of every group. When a
group has changed, return the key of this group to the output set, with NULL
for the fields not contained in this group, and restart that group's
aggregation buffer.
A better way - add a "merge aggregations" operation and calculate one
buffer per group; when the group has changed, merge this main
buffer into the others and return some intermediate result.

I think supporting this kind of grouping operation isn't simple, and
a separate implementation of ROLLUP would be better.

Regards, Tsarev Oleg



Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Pavel Stehule
2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/8 Alvaro Herrera alvhe...@commandprompt.com:
 Олег Царев escribió:
 Hello all!
 If no one objecte (all agree, in other say) i continue work on patch -
 particulary, i want support second strategy (tuple store instead of
 hash-table) for save order of source (more cheap solution in case with
 grouping sets + order by), investigate and brainstorm another
 optimisation, writing regression tests and technical documentation.
 But I need some time for complete my investigation internals of
 PostgreSQL, particulary CTE.

 Where are we on this patch?  Is it moving forward?


 It seems to me that the patch goes backward.

little bit.

 I looked trough the gsets-0.6.diff for about an hour, and found it is
 now only a syntax sugar that builds multiple GROUP BY queries based on
 CTE functionality. There's no executor modification.


I wrote the older version at a time when CTE wasn't implemented yet. The old
patch had its own executor node, and was based on creating a hash table per
group. That patch was maybe a little bit faster than 0.6, but it had a lot of
bugs. Whoever knows the planner code for grouping has to agree with me:
this code isn't readable, and I didn't want to make it more complicated and
less readable. So I had the idea to join grouping sets with CTE. Grouping
sets are a subset of CTE, so it is possible. Grouping sets are generally
non-recursive CTEs, and (I believe) these should be optimized
together.

I preferred using CTE, because this was the shortest way to a prototype with
fewer bugs - with full functionality.

 If I remember correctly, the original patch touched executor parts.
 I'd buy if the GROUPING SETS touches executor but I don't if this is
 only syntax sugar, because you can write it as the same by yourself
 without GROUPING SETS syntax. The motivation we push this forward is
 performance that cannot be made by rewriting query, I guess.


I don't think you can do a simple transformation from GROUPING SETS
syntax to CTE. And what's more, why would you optimize grouping sets
and not optimize non-recursive CTE?

 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.


This way is possible too, but it needs an absolute reworking of the grouping
planner and executor. Maybe it is time to do it. That is work for somebody
other than me; my place is stored procedures.

 When we want GROUPING SET(a, b), at first we sort by a and aggregate
 then sort by b and aggregate. This is the same as:

 select a, null, count(*) from x group by a
 union all
 select null, b, count(*) from x group by b

 so nothing better than query rewriting unless we invent something new.


The problem is when x is a subquery. Then it is better to use a CTE, because
we don't need to repeat the evaluation of x twice. In the most typical use
case, x isn't a table.
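
For example (a sketch only, with a made-up expensive subquery), the CTE form
evaluates x just once and feeds both groupings from it:

with x as (
    -- hypothetical expensive subquery we only want to evaluate once
    select r.region as a, s.product as b, s.amount
    from sales s join regions r using (region_id)
)
select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b;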

 But in case of sub total and grand total like ROLLUP query, GroupAgg
 can do it by one-time scan by having multiple life cycle PerGroup
 state.


 Anyway, before going ahead we need to find rough sketch of how to
 implement this feature. Only syntax sugar is acceptable? Or internal
 executor support is necessary?

I think both ways are possible. Probably the cleanest way is a total
refactoring of the grouping executor and grouping planner. I am not
sure we need new nodes - they are all there - but these nodes cannot work
in parallel now.




 Regards,


 --
 Hitoshi Harada




Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Hitoshi Harada
2009/8/14 Олег Царев zabiva...@gmail.com:
 All rights, exclude
 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.
 because group by it's optimized version of grouping sets.
 Of course, we can extend the current definition of group by, but we
 regress perfomance of it.
 Some questions for you:

 How calcualte aggregation on ROLLUP on single pass?

I'd imagine something like:

select a, b, count(*) from x group by rollup(a, b);

PerGroup all = init_agg(), a = init_agg(), ab = init_agg();
while(row = fetch()){
  if(group_is_changed(ab, row)){
result_ab = finalize_agg(ab);
ab = init_agg();
  }
  if(group_is_changed(a, row)){
result_a = finalize_agg(a);
a = init_agg();
  }
  advance_agg(all, row);
  advance_agg(a, row);
  advance_agg(ab, row);
}
result_all = finalize_agg(all);

Of course you should take care about the best way to return result rows and
continue the aggregates, and the number of grouping keys can vary from 1 to
many; it is quite possible. And normal GROUP BY is just the case of key = a
only, so there won't be a performance regression.

 Better way - add operation merge aggregations, and calculate one
 buffer on every group, when group has cnahged - merge this main
 buffer to other, and return some intermediate result.

Merge aggregates sound fascinating to me, not only for this feature
but also for partitioned-table aggregates. But adding another function
(a merge function?) to the current aggregate system is quite a long way off.


 I think, support this of grouping operation isn't simple, and
 different implementation of ROLLUP it's better.

Surely not simple. Adding another node is one of the choices, but from
a code maintenance point of view I feel it is better to integrate it
into nodeAgg. nodeWindowAgg and nodeAgg have similar aggregate
processes but don't share the code, so a bug fix in nodeAgg isn't complete
in itself; we must re-check nodeWindowAgg as well. Adding another
agg-like node *may* be a kind of nightmare.


Regards,

-- 
Hitoshi Harada



Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Pavel Stehule
2009/8/13 Олег Царев zabiva...@gmail.com:
 2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/8 Alvaro Herrera alvhe...@commandprompt.com:
 Олег Царев escribió:
 Hello all!
 If no one objecte (all agree, in other say) i continue work on patch -
 particulary, i want support second strategy (tuple store instead of
 hash-table) for save order of source (more cheap solution in case with
 grouping sets + order by), investigate and brainstorm another
 optimisation, writing regression tests and technical documentation.
 But I need some time for complete my investigation internals of
 PostgreSQL, particulary CTE.

 Where are we on this patch?  Is it moving forward?


 It seems to me that the patch goes backward.

 I looked trough the gsets-0.6.diff for about an hour, and found it is
 now only a syntax sugar that builds multiple GROUP BY queries based on
 CTE functionality. There's no executor modification.

 If I remember correctly, the original patch touched executor parts.
 I'd buy if the GROUPING SETS touches executor but I don't if this is
 only syntax sugar, because you can write it as the same by yourself
 without GROUPING SETS syntax. The motivation we push this forward is
 performance that cannot be made by rewriting query, I guess.

 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.

 When we want GROUPING SET(a, b), at first we sort by a and aggregate
 then sort by b and aggregate. This is the same as:

 select a, null, count(*) from x group by a
 union all
 select null, b, count(*) from x group by b

 so nothing better than query rewriting unless we invent something new.

 But in case of sub total and grand total like ROLLUP query, GroupAgg
 can do it by one-time scan by having multiple life cycle PerGroup
 state.

 Anyway, before going ahead we need to find rough sketch of how to
 implement this feature. Only syntax sugar is acceptable? Or internal
 executor support is necessary?


 Regards,


 --
 Hitoshi Harada


 All rights, exclude
 Because GROUP BY we have today is a subset of GROUPING SETS by
 definition, I suppose we'll refactor nodeAgg.c so that it is allowed
 to take multiple group definitions. And we must support both of
 HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
 earlier patch does. For GroupAgg, it is a bit complicated since we
 sort by different key sets.
 because group by it's optimized version of grouping sets.
 Of course, we can extend the current definition of group by, but we
 regress perfomance of it.
 Some questions for you:

 How calcualte aggregation on ROLLUP on single pass?
 Stupid way - store different buffer of aggregations for every group,
 and accumulate every record on group for every calculator. When a
 group has changed, return key of this group to output set with  NULL
 for fields not contains in this group, and restart current buffer of
 aggregation.
 Better way - add operation merge aggregations, and calculate one
 buffer on every group, when group has cnahged - merge this main
 buffer to other, and return some intermediate result.


I don't think this is possible for all operations. Don't forget:
people can implement their own aggregates. Example: weighted average.

regards
Pavel Stehule

 I think, support this of grouping operation isn't simple, and
 different implementation of ROLLUP it's better.

 Regards, Tsarev Oleg




Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Josh Berkus
All,


 The other reason is what I think Greg Smith was mentioning --
 simplifying the process of grabbing a usable PITR backup for novice
 users.  That seems like it has merit.

While we're at this, can we add xlog_location as a file-location GUC?
It seems inconsistent that we're still requiring people to symlink the
pg_xlog in order to move that.  Or is that already part of this set of
patches?

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] surprising trigger/foreign key interaction

2009-08-13 Thread Josh Berkus
On 8/13/09 7:03 AM, Alvaro Herrera wrote:
 Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 
 I imagine this is so because of some old fiddling to get semantics just
 right for obscure corner cases, but it feels wrong nevertheless.
 I suspect it was reluctance to use the EvalPlanQual semantics (which
 are pretty bogus in their own way) for perfectly deterministic
 single-transaction cases.
 
 I suspect the FK trigger messing up the visibility is an obscure corner
 case too :-(

Yes, but it's one which happens frequently.

I've already had to debug a client case where a client had a before
trigger, an after trigger, and a self-join FK.  That seems like a
bizarre arrangement, but for a proximity tree (which we're going to see
a lot more of thanks to WITH RECURSIVE) it actually makes a lot of sense.

The result is that you can get a *successful* transaction, with no
error, that nevertheless results in rows which are inconsistent with the
FK -- silent data corruption.  I had to tell the user to disable the FK
and maintain consistency by trigger as well, which doesn't reflect well
on our devotion to avoiding data corruption.

This is 100% reproducible; test case below my sig.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


-- create two tables, one of which is the master table (reftable) the
other of which is a child which contains a tree structure (treetab):

create table reftable(
refid int primary key,
refname text
);


create table treetab (
id int primary key,
parent int,
refid int not null references reftable(refid) on delete cascade,
name text
);

-- now create a trigger function to maintain the integrity of the trees
-- in treetab by pulling up
-- each node to its parent if intermediate nodes get deleted
-- this trigger is inherently flawed and won't work with the FK below

create function treemaint () returns trigger as $t$
begin
update treetab set parent = OLD.parent
where parent = OLD.id;
return OLD;
end; $t$ language plpgsql;

create trigger treemaint_trg before delete on treetab
for each row execute procedure treemaint();

-- populate reftable

insert into reftable
select i, ( 'Ref' || i::TEXT ) from generate_series(1,100) as g(i);

-- populate treetab with 10 rows each pointing to reftable

insert into treetab (id, refid)
select i, (( i / 10::INT ) + 1 )
from generate_series (1,900) as g(i);

-- create trees in treetab.  for this simple example each treeset is
-- just a chain with each child node
-- pointing to one higher node

update treetab set parent = ( id - 1 )
where id > (
select min(id) from treetab tt2
where tt2.refid = treetab.refid);

update treetab set name = ('tree' || parent::TEXT || '-' || id::TEXT);

-- now create a self-referential FK to enforce tree integrity.  This
-- logically breaks the trigger

alter table treetab add constraint selfref foreign key (parent)
references treetab (id);

-- show tree for id 45

select * from treetab where refid = 45;

 id  | parent | refid |    name
-----+--------+-------+-------------
 440 |        |    45 |
 441 |    440 |    45 | tree440-441
 442 |    441 |    45 | tree441-442
 443 |    442 |    45 | tree442-443
 444 |    443 |    45 | tree443-444
 445 |    444 |    45 | tree444-445
 446 |    445 |    45 | tree445-446
 447 |    446 |    45 | tree446-447
 448 |    447 |    45 | tree447-448
 449 |    448 |    45 | tree448-449


-- now, we're going to delete the tree.  This delete should fail with an
-- error because the
-- trigger will violate selfref

delete from reftable where refid = 45;

-- however, it doesn't fail.  it reports success, and some but not all
-- rows from treetab
-- are deleted, leaving the database in an inconsistent state.

select * from treetab where refid = 45;

 id  | parent | refid |    name
-----+--------+-------+-------------
 441 |        |    45 | tree440-441
 443 |    441 |    45 | tree442-443
 445 |    443 |    45 | tree444-445
 447 |    445 |    45 | tree446-447
 449 |    447 |    45 | tree448-449

-- this means we now have rows in the table which
-- violate the FK to reftable.

postgres=# select * from reftable where refid = 45;
 refid | refname
-------+---------
(0 rows)
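
A quick cross-check (a sketch using only the tables defined above): the
orphaned child rows can be listed directly with an anti-join, e.g.

select t.*
from treetab t left join reftable r using (refid)
where r.refid is null;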

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 While we're at this, can we add xlog_location as a file-location GUC?

That was proposed and rejected quite a long time ago.  We don't *want*
people to be able to just change a GUC and have their xlog go
somewhere else, because of the foot-gun potential.  You need to be sure
that the existing WAL files get moved over when you do something like
that, and the GUC infrastructure isn't up to ensuring that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY speedup

2009-08-13 Thread Pierre Frédéric Caillaud
In the previous mails I made a mistake, writing MTuples/s instead of  
MDatums/s, sorry about that. It is the number of rows x columns. The  
title was wrong, but the data was right.


I've been doing some tests on COPY FROM ... BINARY.

- inlines in various pg_get* etc
- a faster buffer handling for copy
- that's about it...

In the below tables, you have p17 (ie test patch 17, the last one) and  
straight postgres compared.


COPY annonces_2 FROM 'annonces.bin' BINARY  :
  Time | Speedup |  Table |  KRows | MDatums | Name
   (s) |         |   MB/s |     /s |      /s |
-------|---------|--------|--------|---------|------------------------------
 8.417 |  1.40 x |  38.70 |  49.13 |    1.92 | 8.4.0 / p17
11.821 |     --- |  27.56 |  34.98 |    1.36 | 8.4.0 / compiled from source


COPY archive_data_2 FROM 'archive_data.bin' BINARY  :
  Time | Speedup | Table |  KRows | MDatums | Name
   (s) |         |  MB/s |     /s |      /s |
-------|---------|-------|--------|---------|----------------------------------
15.314 |  1.93 x | 25.94 | 172.88 |    4.84 | 8.4.0 / p17 COPY FROM BINARY all
29.520 |     --- | 13.46 |  89.69 |    2.51 | 8.4.0 / compiled from source


COPY test_one_int_2 FROM 'test_one_int.bin' BINARY  :
  Time | Speedup |  Table |   KRows | MDatums | Name
   (s) |         |   MB/s |      /s |      /s |
-------|---------|--------|---------|---------|----------------------------------
10.003 |  1.39 x |  30.63 |  999.73 |    1.00 | 8.4.0 / p17 COPY FROM BINARY all
13.879 |     --- |  22.08 |  720.53 |    0.72 | 8.4.0 / compiled from source



COPY test_many_ints_2 FROM 'test_many_ints.bin' BINARY  :
  Time | Speedup | Table |  KRows | MDatums | Name
   (s) |         |  MB/s |     /s |      /s |
-------|---------|-------|--------|---------|----------------------------------
 6.009 |  2.08 x | 21.31 | 166.42 |    4.33 | 8.4.0 / p17 COPY FROM BINARY all
12.516 |     --- | 10.23 |  79.90 |    2.08 | 8.4.0 / compiled from source


I thought it might be interesting to get split timings of the various  
steps in COPY FROM, so I simply commented out bits of code and ran tests.


The delta columns are differences between two lines, that is the time  
taken in the step mentioned on the right.



reading data only = reading all the data and parsing it into chunks, doing  
everything until the RecvFunc is called.

RecvFuncs = same, + RecvFunc is called
heap_form_tuple = same + heap_form_tuple is called
triggers = same + triggers are applied
insert = actual tuple insertion
p17 = total time (post insert triggers, constraint check, etc)

  Time | Delta | Row delta | Datum delta | Name
   (s) |   (s) |      (us) |        (us) |
-------|-------|-----------|-------------|-------------------
 1.311 |   --- |       --- |         --- | reading data only
 4.516 | 3.205 |     7.750 |       0.199 | RecvFuncs
 4.534 | 0.018 |     0.043 |       0.001 | heap_form_tuple
 5.323 | 0.789 |     1.908 |       0.049 | triggers
 8.182 | 2.858 |     6.912 |       0.177 | insert
 8.417 | 0.236 |     0.570 |       0.015 | p17


COPY archive_data_2 FROM 'archive_data.bin' BINARY  :
  Time |  Delta | Row delta | Datum delta | Name
   (s) |    (s) |      (us) |        (us) |
-------|--------|-----------|-------------|-------------------
 4.729 |    --- |       --- |         --- | reading data only
 8.508 |  3.778 |     1.427 |       0.051 | RecvFuncs
 8.567 |  0.059 |     0.022 |       0.001 | heap_form_tuple
10.804 |  2.237 |     0.845 |       0.030 | triggers
14.475 |  3.671 |     1.386 |       0.050 | insert
15.314 |  0.839 |     0.317 |       0.011 | p17


COPY test_one_int_2 FROM 'test_one_int.bin' BINARY  :
  Time | Delta | Row delta | Datum delta | Name
   (s) |   (s) |      (us) |        (us) |
-------|-------|-----------|-------------|-------------------
 1.247 |   --- |       --- |         --- | reading data only
 1.745 | 0.498 |     0.050 |       0.050 | RecvFuncs
 1.750 | 0.004 |     0.000 |       0.000 | heap_form_tuple
 3.114 | 1.364 |     0.136 |       0.136 | triggers
 9.984 | 6.870 |     0.687 |       0.687 | insert
10.003 | 0.019 |     0.002 |       0.002 | p17


COPY test_many_ints_2 FROM 'test_many_ints.bin' BINARY  :
  Time | Delta | Row delta | Datum delta | Name
   (s) |   (s) |      (us) |        (us) |
-------|-------|-----------|-------------|-------------------
 1.701 |   --- |       --- |         --- | reading data only
 3.122 | 1.421 |     1.421 |       0.055 | RecvFuncs
 3.129 | 0.008 |     0.008 |       0.000 | heap_form_tuple
 3.754 | 0.624 |     0.624 |       0.024 | triggers
 5.639 | 1.885 |     1.885 |       0.073 | insert
 6.009 | 0.370 |     0.370 |       0.014 | p17

We can see that :

- reading and parsing the data is still slow (actually, everything is  
copied something like 3-4 times)

- RecvFuncs take quite long, too
- triggers use some time, although the table has no triggers ? This is  

Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Josh Berkus
Tom,

 That was proposed and rejected quite a long time ago.  We don't *want*
 people to be able to just change a GUC and have their xlog go
 somewhere else, because of the foot-gun potential.  You need to be sure
 that the existing WAL files get moved over when you do something like
 that, and the GUC infrastructure isn't up to ensuring that.

Doesn't the same argument apply to data_directory?

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 That was proposed and rejected quite a long time ago.  We don't *want*
 people to be able to just change a GUC and have their xlog go
 somewhere else, because of the foot-gun potential.  You need to be sure
 that the existing WAL files get moved over when you do something like
 that, and the GUC infrastructure isn't up to ensuring that.

 Doesn't the same argument apply to data_directory?

No.  Changing data_directory might result in failure to start (if
you didn't move the actual data over there) but it's unlikely to result
in irretrievable corruption of your data.  The key issue here is the
need to keep data and xlog in sync, and moving the whole data directory
doesn't create risks of that sort.

Now admittedly it's not hard to screw yourself with a careless manual
move of xlog, either.  But at least the database didn't hand you a knob
that invites clueless frobbing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Hitoshi Harada
2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 I prefered using CTE, because this way was the most short to small
 bugs less prototype - with full functionality.

You could do it by query rewriting, but as you say the cleanest way is a
total refactoring of the existing nodeAgg; that the rewriting approach
would be easy to implement is not, by itself, convincing.

 When we want GROUPING SET(a, b), at first we sort by a and aggregate
 then sort by b and aggregate. This is the same as:

 select a, null, count(*) from x group by a
 union all
 select null, b, count(*) from x group by b

 so nothing better than query rewriting unless we invent something new.

 the problem is when x is subquery. Then is better using CTE, because
 we don't need repeat x evaluation twice. The most typical use case is,
 so x isn't table.

So we need a single-scan aggregate as far as possible. Buffering the
subquery's result is possible without a CTE node. Tuplestore has that
functionality, but I found the buffered result will be sorted multiple
times; one way might be to allow tuplesort to perform the sort multiple
times with different keys.
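
For illustration, the CTE form mentioned above would look roughly like
this (a sketch only; x stands in for an arbitrary subquery):

WITH x AS (
    SELECT a, b FROM some_subquery   -- hypothetical subquery, evaluated once
)
SELECT a, NULL AS b, count(*) FROM x GROUP BY a
UNION ALL
SELECT NULL, b, count(*) FROM x GROUP BY b;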



Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Josh Berkus

 Now admittedly it's not hard to screw yourself with a careless manual
 move of xlog, either.  But at least the database didn't hand you a knob
 that invites clueless frobbing.

So really rather than a GUC we should have a utility for moving the xlog.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Now admittedly it's not hard to screw yourself with a careless manual
 move of xlog, either.  But at least the database didn't hand you a knob
 that invites clueless frobbing.

 So really rather than a GUC we should have a utility for moving the xlog.

Yeah, that would work.  Although it would probably take as much verbiage
to document the utility as it does to document how to do it manually.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Pavel Stehule
2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 I prefered using CTE, because this way was the most short to small
 bugs less prototype - with full functionality.

 You could make it by query rewriting, but as you say the best cleanest
 way is total refactoring of existing nodeAgg. How easy to implement is
 not convincing.


I agree. I simply don't have the time and energy to do it. I would like to
concentrate on finishing some plpgsql issues, and then I have to do
some things other than PostgreSQL. There is a fully functional
prototype, and everybody is welcome to continue this work.


 When we want GROUPING SET(a, b), at first we sort by a and aggregate
 then sort by b and aggregate. This is the same as:

 select a, null, count(*) from x group by a
 union all
 select null, b, count(*) from x group by b

 so nothing better than query rewriting unless we invent something new.

 the problem is when x is subquery. Then is better using CTE, because
 we don't need repeat x evaluation twice. The most typical use case is,
 so x isn't table.

 So we need single scan aggregate as far as possible. Buffering
 subquery's result is possible without CTE node. Tuplestore has that
 functionality but I found the buffered result will be sorted multiple
 times, one way might be to allow tuplesort to perform sort multiple
 times with different keys.


Yes, I am not afraid of multiple evaluation of aggregates; it's cheap.
The problem is multiple table scans. I thought about some new kind of
aggregate. Current aggregates process rows one by one with a final
operation. Some new kind of aggregate could work over a tuple
store: internally it would get a pointer to the tuplestore and the number
of rows. This should be very fast for functions like median or array_agg.
I think it's similar to window functions - only it is not a window
function.

If you want to optimize for speed, then the fastest solution will
be using hash tables - then you don't need a tuplestore. For ROLLUP a
single scan is maybe possible - but I am not sure - there is an
important fact: the final function cannot modify the intermediate data.

Pavel



 Regards,


 --
 Hitoshi Harada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW-based dblink

2009-08-13 Thread David Fetter
On Thu, Aug 13, 2009 at 02:01:19PM +0300, Heikki Linnakangas wrote:
 Itagaki Takahiro wrote:
  Present dblink is a thin wrapper of libpq, but some of my customers
  want automatic transaction managements. Remote transactions are
  committed with 2PC when the local transaction is committed.
  To achieve it, I think we need on-commit trigger is needed,
  but it is hard to implement with current infrastructure.
  (That is one of the reason I proposed to merge dblink into core.)
 
 Quite aside from the requirement for on-commit trigger, how exactly
 would you use 2PC with the remote database? When would you issue PREPARE
 TRANSACTION, and when would COMMIT PREPARED?

For what it's worth, in DBI-Link, I've allowed some of this by letting
people pass commands like BEGIN, COMMIT and ROLLBACK through to the
remote side.  However, it doesn't--can't, as far as I know--implement
the full 2PC.
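
For comparison, contrib/dblink can pass the same statements through
explicitly; a minimal sketch, assuming a connection named 'conn' was
opened earlier with dblink_connect (the remote table name is made up):

SELECT dblink_exec('conn', 'BEGIN');
SELECT dblink_exec('conn', 'UPDATE remote_table SET flag = true');
SELECT dblink_exec('conn', 'COMMIT');  -- or ROLLBACK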

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-08-13 Thread Peter Eisentraut
On Thursday 13 August 2009 18:07:51 Alvaro Herrera wrote:
 Oleg Bartunov wrote:
  Peter,
 
  how to write accented characters in sgml ? Is't not allowed to write
  them as is ?

 &aacute; for á, etc.  You can't use characters that aren't in Latin-1 I
 think. Writing them literally is not allowed.

It's somehow possible, but it's not as straightforward as say with XML.  And 
you might get into a Latin-1 vs UTF-8 mixup.  At least that's what I noticed 
in my limited testing the other day.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Josh Berkus

 Yeah, that would work.  Although it would probably take as much verbiage
 to document the utility as it does to document how to do it manually.

Yes, but it would *feel* less hackish to sysadmins and DBAs, and make
them more confident about moving the xlogs.

Getting it to work on windows will be a pita, though ... Andrew?

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Andrew Dunstan



Josh Berkus wrote:

Yeah, that would work.  Although it would probably take as much verbiage
to document the utility as it does to document how to do it manually.



Yes, but it would *feel* less hackish to sysadmins and DBAs, and make
them more confident about moving the xlogs.

Getting it to work on windows will be a pita, though ... Andrew?
  


Why would it? All the tools are there - if they weren't, tablespaces wouldn't work.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-08-13 Thread Bruce Momjian
Peter Eisentraut wrote:
 On Thursday 13 August 2009 18:07:51 Alvaro Herrera wrote:
  Oleg Bartunov wrote:
   Peter,
  
   how to write accented characters in sgml ? Is't not allowed to write
   them as is ?
 
  &aacute; for á, etc.  You can't use characters that aren't in Latin-1 I
  think. Writing them literally is not allowed.
 
 It's somehow possible, but it's not as straightforward as say with XML.  And 
 you might get into a Latin-1 vs UTF-8 mixup.  At least that's what I noticed 
 in my limited testing the other day.

The top of release.sgml has instructions on that because that is often
something we need to do for names in release notes:

non-ASCII characters    convert to HTML4 entity (&) escapes

    official:     http://www.w3.org/TR/html4/sgml/entities.html
    one page:     http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
    other lists:  http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
                  http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
                  http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

    we cannot use UTF8 because SGML Docbook
    does not support it
        http://www.pemberley.com/janeinfo/latin1.html#latexta

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha 1 release notes

2009-08-13 Thread Bruce Momjian
Tom Lane wrote:
 Massa, Harald Armin c...@ghum.de writes:
  within source code, build options there is:
  - Reserve the shared memory region during backend startup on Windows,
so that memory allocated by starting third party DLLs doesn't end up
conflicting with it.  Hopefully this solves the long-time issue with
could not reattach to shared memory errors on Win32.
 
  I suggest that it should also be pointed out that this fix will be
  backported to 8.3 and 8.4 (as much as I followed the ML);
 
 Normally, bug fixes that have been back-patched wouldn't be mentioned at
 all in a new major release's release notes.  The implied base that we
 are comparing to in major-release notes is the end of the prior branch's
 updates.  I'm not sure if this case should be an exception, or if we
 should have a different general rule for alpha releases.  We'd like to
 get more testing on that fix, so I think it is reasonable to mention it
 for alpha1 --- but is that an exception specific to this bug fix, or
 does it indicate we want to handle bug fixes differently in general
 within alpha release notes?
 
 In any case, it is not the function of the alpha release notes to
 discuss changes in earlier release branches.  The reason the commit
 log points out the back-patch is to make it easier to extract the
 information when we prepare release notes for the back-branch updates.

FYI, tools/pgcvslog -d removes backbranch commits automatically.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot standby and synchronous replication status

2009-08-13 Thread Jaime Casanova
On Thu, Aug 13, 2009 at 1:49 PM, Josh Berkusj...@agliodbs.com wrote:

 Yeah, that would work.  Although it would probably take as much verbiage
 to document the utility as it does to document how to do it manually.

 Yes, but it would *feel* less hackish to sysadmins and DBAs, and make
 them more confident about moving the xlogs.


and it is better for marketing... in fact, when I say we need to move
them manually with a symlink, sysadmins look at me like I'm some strange bug
;)

 Getting it to work on windows will be a pita, though ... Andrew?


mmm... is there a way to do this *manually* on Windows? maybe this
is reason enough for a tool to do it...

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
[ moving to -hackers ]

If this topic has been discussed previously, please point me to the
earlier threads.

Why aren't we more opportunistic about freezing tuples? For instance, if
we already have a dirty buffer in cache, we should be more aggressive
about freezing those tuples than freezing tuples on disk.

I looked at the code, and it looks like if we freeze one tuple on the
page during VACUUM, we mark it dirty. Wouldn't that be a good
opportunity to freeze all the other tuples on the page that we can?

Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
opportunities to set hint bits or freeze tuples.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Getting rid of the flat authentication file

2009-08-13 Thread Tom Lane
I've been looking into what it would take to eliminate the flat file for
pg_auth info.  The implication of doing that is that authentication has
to be postponed until inside InitPostgres(), where we can read the
actual system catalogs instead.

The easy way to do it would be to postpone authentication until after we
have selected and entered a database.  At that point we could use existing
code such as is_member_of_role().  There is a security disadvantage to
that: you would find out whether the database name you'd given was valid
before any authentication check occurred.  Since database names are often
also user names, that would give a brute-force attacker a leg up on
discovering valid user names.  Plan B is to use the same techniques for
reading pg_authid and pg_auth_members as InitPostgres is now using for
reading pg_database.  That's perfectly doable; the main downside to it
is that if the shared relcache file were missing, we'd be reduced to
seqscan searches of these files, which could be pretty darn unpleasant
for role membership searches.  However, the shared relcache file should
hardly ever be missing, and standard pg_hba.conf setups (with the role
column always ALL) don't result in role membership checks anyway.  So
I'm leaning to plan B here.

Another issue is that currently, option switches supplied via PGOPTIONS
are processed at entry to PostgresMain (unless they are for SUSET GUC
variables).  If we retained that behavior then they'd be applied before
authentication occurred.  This worries me, though I can't immediately
point to a problem case.  I'd be inclined to postpone the processing of
all user-supplied switches until after InitPostgres.  This would
simplify the logic in PostgresMain, too, since we'd not have to process
SUSET variables separately from others.  The only real downside I can
see is that it would make -W (post_auth_delay) pretty much useless for
its intended purpose of assisting debugging of InitPostgres-time
problems.  We might as well remove it and just rely on pre_auth_delay.
This point is only of interest to hackers, and not all that often even
to us, so I don't feel that it's a critical objection.

So the disadvantages of not using the flat file for authentication
seem to boil down to
* more cycles expended before we can reject a bad username/password
* could be slow in the uncommon case that the shared relcache file is missing
* debugging InitPostgres problems will get more inconvenient
As against this, we'd be getting rid of a bunch of klugy, slow code with
assorted failure points.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Alvaro Herrera
Jeff Davis wrote:

 Why aren't we more opportunistic about freezing tuples? For instance, if
 we already have a dirty buffer in cache, we should be more aggressive
 about freezing those tuples than freezing tuples on disk.

The most widely cited reason is that you lose forensics data.  Although
they are increasingly rare, there are still situations in which the heap
tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
the best/only way to find out what happened and thus fix the bug.  If
you freeze early, there's just no way to know.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Kevin Grittner
Alvaro Herrera alvhe...@commandprompt.com wrote: 
 Jeff Davis wrote:
 
 Why aren't we more opportunistic about freezing tuples? For
 instance, if we already have a dirty buffer in cache, we should be
 more aggressive about freezing those tuples than freezing tuples on
 disk.
 
 The most widely cited reason is that you lose forensics data. 
 Although they are increasingly rare, there are still situations in
 which the heap tuple machinery messes up and the xmin/xmax/etc
 fields of the tuple are the best/only way to find out what happened
 and thus fix the bug.  If you freeze early, there's just no way to
 know.
 
Although I find it hard to believe that this is a compelling argument in
the case where an entire table or database is loaded in a single
database transaction.
 
In the more general case, I'm not sure why this argument applies here
but not to cassert and other diagnostic options.  It wouldn't surprise
me to find workloads where writing data three times (once for the
data, once for hint bits, and once to freeze the tid) affects
performance more than cassert.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 17:58 -0400, Alvaro Herrera wrote:
 The most widely cited reason is that you lose forensics data.  Although
 they are increasingly rare, there are still situations in which the heap
 tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
 the best/only way to find out what happened and thus fix the bug.  If
 you freeze early, there's just no way to know.

As it stands, it looks like it's not just one extra write for each
buffer, but potentially many (theoretically, as many as there are tuples
on a page). I suppose the reasoning is that tuples on the same page have
approximately the same xmin, and are likely to be frozen at the same
time. But it seems entirely reasonable that the xmins on one page span
several VACUUM runs, and that seems more likely with the FSM. That means
that a few tuples on the page are older than 100M and get frozen, and
the rest are only about 95M transactions old, so we have to come back
and freeze them again, later.

Let's say that we had a range like 50-100M, where if it's older than
100M, we freeze it, and if it's older than 50M we freeze it only if it's
on a dirty page. We would still have forensic evidence, but we could
make a range such that we avoid writing multiple times.

And people who don't care about forensic evidence can set it to 0-100M.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 17:17 -0500, Kevin Grittner wrote:
 It wouldn't surprise
 me to find workloads where writing data three times (once for the
 data, once for hint bits, and once to freeze the tid)

I'm not sure that we're limited to 3 times, here. I could be missing
something, but if you have tuples with different xmins on the same page,
some might be older than 100M, which you freeze, and then you will have
to come back later to freeze the rest. As far as I can tell, the maximum
number of writes is the number of tuples that fit on the page.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Robert Haas
On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davispg...@j-davis.com wrote:
 Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
 opportunities to set hint bits or freeze tuples.

One of the tricky things here is that the time you are most likely
to want to do this is when you are loading a lot of data.  But in that
case shared buffers are likely to be written back to disk before
transaction commit, so it'll be too early to do anything.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus

 Why aren't we more opportunistic about freezing tuples? For instance, if
 we already have a dirty buffer in cache, we should be more aggressive
 about freezing those tuples than freezing tuples on disk.
 
 The most widely cited reason is that you lose forensics data.  Although
 they are increasingly rare, there are still situations in which the heap
 tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
 the best/only way to find out what happened and thus fix the bug.  If
 you freeze early, there's just no way to know.

That argument doesn't apply.  If the page is in memory and is being
written anyway, and some of the rows are past vacuum_freeze_min_age,
then why not freeze them rather than waiting for a vacuum process to
read them off disk and rewrite them?

We're not talking about freezing every tuple as soon as it's out of
scope.  Just the ones which are more than 100m (or whatever the setting
is) old.  I seriously doubt that anyone is doing useful forensics using
xids which are 100m old.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 Let's say that we had a range like 50-100M, where if it's older than
 100M, we freeze it, and if it's older than 50M we freeze it only if it's
 on a dirty page. We would still have forensic evidence, but we could
 make a range such that we avoid writing multiple times.

Yeah, making the limit slushy would doubtless save some writes, with
not a lot of downside.

 And people who don't care about forensic evidence can set it to 0-100M.

Everybody *thinks* they don't care about forensic evidence.  Until they
need it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
 Yeah, making the limit slushy would doubtless save some writes, with
 not a lot of downside.

OK, then should we make this a TODO? I'll make an attempt at this.

  And people who don't care about forensic evidence can set it to 0-100M.
 
 Everybody *thinks* they don't care about forensic evidence.  Until they
 need it.

We already allow setting vacuum_freeze_min_age to zero, so I don't see a
solution here other than documentation.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
 Everybody *thinks* they don't care about forensic evidence.  Until they
 need it.

 We already allow setting vacuum_freeze_min_age to zero, so I don't see a
 solution here other than documentation.

Yeah, we allow it.  I just don't want to encourage it ... and definitely
not make it default.

What are you envisioning exactly?  If vacuum finds any reason to dirty
a page (or it's already dirty), then freeze everything on the page that's
got age > some lower threshold?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus
Jeff, Tom,

 Let's say that we had a range like 50-100M, where if it's older than
 100M, we freeze it, and if it's older than 50M we freeze it only if it's
 on a dirty page. We would still have forensic evidence, but we could
 make a range such that we avoid writing multiple times.
 
 Yeah, making the limit slushy would doubtless save some writes, with
 not a lot of downside.

This would mean two settings: vacuum_freeze_min_age and
vacuum_freeze_dirty_age.  And we'd need to add those to the
autovacuum settings for each table as well.  While we could just make
one setting 1/2 of the other, that prevents me from saying:

"freeze this table aggressively if it's in memory, but wait a long time
to vacuum if it's on disk"

I can completely imagine a table which has a vacuum_freeze_dirty_age of
1 and a vacuum_freeze_min_age of 1m.
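
To make that concrete, a hypothetical per-table setup might look like the
sketch below (autovacuum_freeze_min_age is a real per-table storage
parameter; autovacuum_freeze_dirty_age and the table name are made up):

-- hypothetical: the *_dirty_age parameter does not exist today
ALTER TABLE hot_table
  SET (autovacuum_freeze_min_age   = 1000000,
       autovacuum_freeze_dirty_age = 1);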

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus

 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age > some lower threshold?

I was envisioning: if the page is already dirty and in memory *for any
reason*, then freeze rows below some threshold.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_hba.conf: samehost and samenet

2009-08-13 Thread Stef Walter
I love using postgresql, and have for a long time. I'm involved with
almost a hundred postgresql installs. But this is the first time I've
gotten into the code.

Renumbering networks happens often, and will happen more frequently as
IPv4 space runs low. The IP-based restrictions in pg_hba.conf are one of
the places where renumbering can break running installs. In addition,
when postgresql is run in BSD jails, 127.0.0.1 is not available for use
in pg_hba.conf.

It would be great if, in the cidr-address field of pg_hba.conf, we could
specify samehost and samenet. These special values use the local
host's network interface addresses. samehost allows an IP assigned to
the local machine. samenet allows any host on the subnets connected to
the local machine.

This is similar to the sameuser value that's allowed in the database
field.

A change like this would enable admins like myself to distribute
postgresql with something like this in the default pg_hba.conf file:

host  all all   samenet md5
hostssl   all all   0.0.0.0/0   md5

I've attached an initial patch which implements samehost and
samenet. The patch looks more invasive than it really is, due to
necessary indentation change (ie: an if block), and moving some code into
a separate function.

Thanks for your time. How can I help get a feature like this into
postgresql?

Cheers,

Stef
diff --git a/configure b/configure
index 61b3c72..7bcfcec 100755
*** a/configure
--- b/configure
*** done
*** 9642,9648 
  
  
  
! for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h
  do
  as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh`
  if { as_var=$as_ac_Header; eval test \\${$as_var+set}\ = set; }; then
--- 9642,9649 
  
  
  
! 
! for ac_header in crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h ifaddrs.h
  do
  as_ac_Header=`$as_echo ac_cv_header_$ac_header | $as_tr_sh`
  if { as_var=$as_ac_Header; eval test \\${$as_var+set}\ = set; }; then
*** fi
*** 17278,17284 
  
  
  
! for ac_func in cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs
  do
  as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh`
  { $as_echo $as_me:$LINENO: checking for $ac_func 5
--- 17279,17286 
  
  
  
! 
! for ac_func in cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs getifaddrs
  do
  as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh`
  { $as_echo $as_me:$LINENO: checking for $ac_func 5
diff --git a/configure.in b/configure.in
index 505644a..bc37b1b 100644
*** a/configure.in
--- b/configure.in
*** AC_SUBST(OSSP_UUID_LIBS)
*** 962,968 
  ##
  
  dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES
! AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h])
  
  # At least on IRIX, cpp test for netinet/tcp.h will fail unless
  # netinet/in.h is included first.
--- 962,968 
  ##
  
  dnl sys/socket.h is required by AC_FUNC_ACCEPT_ARGTYPES
! AC_CHECK_HEADERS([crypt.h dld.h fp_class.h getopt.h ieeefp.h langinfo.h poll.h pwd.h sys/ipc.h sys/poll.h sys/pstat.h sys/resource.h sys/select.h sys/sem.h sys/socket.h sys/shm.h sys/tas.h sys/time.h sys/un.h termios.h ucred.h utime.h wchar.h wctype.h kernel/OS.h kernel/image.h SupportDefs.h ifaddrs.h])
  
  # At least on IRIX, cpp test for netinet/tcp.h will fail unless
  # netinet/in.h is included first.
*** PGAC_VAR_INT_TIMEZONE
*** 1141,1147 
  AC_FUNC_ACCEPT_ARGTYPES
  PGAC_FUNC_GETTIMEOFDAY_1ARG
  
! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs])
  
  # posix_fadvise() is a no-op on Solaris, so don't incur function overhead
  # by calling it, 2009-04-02
--- 1141,1147 
  AC_FUNC_ACCEPT_ARGTYPES
  PGAC_FUNC_GETTIMEOFDAY_1ARG
  
! AC_CHECK_FUNCS([cbrt dlopen fcvt fdatasync getpeereid getpeerucred getrlimit memmove poll pstat readlink setproctitle setsid sigprocmask symlink sysconf towlower utime utimes waitpid wcstombs getifaddrs])
  
  # posix_fadvise() is a 

Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 19:05 -0400, Tom Lane wrote:
 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age  some lower threshold?

Yes. There are two ways to do the threshold:
  1. Constant fraction of vacuum_freeze_min_age
  2. Extra GUC

I lean toward #1, because it avoids an extra GUC*, and it avoids the
awkwardness when the lower setting is higher than the higher
setting.

However, #2 might be nice for people who want to live on the edge or
experiment with new values. But I suspect most of the advantage would be
had just by saying that we opportunistically freeze tuples older than
50% of vacuum_freeze_min_age.

Regards,
Jeff Davis

*: As an aside, these GUCs already have incredibly confusing names, and
an extra variable would increase the confusion. For instance, they seem
to use min and max interchangeably.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 18:25 -0400, Robert Haas wrote:
 On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davispg...@j-davis.com wrote:
  Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
  opportunities to set hint bits or freeze tuples.
 
 One of the tricky things here is that the time you are mostly likely
 to want to do this is when you are loading a lot of data.  But in that
 case shared buffers are likely to be written back to disk before
 transaction commit, so it'll be too early to do anything.

I think it would be useful in other cases, like avoiding repeated
freezing of different tuples on the same page.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age  some lower threshold?

 I was envisioning, if the page is already dirty and in memory *for any
 reason*, the freeze rows at below some threshold.

I believe we've had this discussion before.  I do *NOT* want freezing
operations pushed into any random page access, and in particular will
do my best to veto any attempt to put them into the bgwriter.  Freezing
requires accessing the clog and emitting a WAL record, and neither is
appropriate for low-level code like bgwriter.  The deadlock potential
alone is sufficient reason why not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Greg Stark
On Fri, Aug 14, 2009 at 12:07 AM, Josh Berkusj...@agliodbs.com wrote:
 freeze this table agressively if it's in memory, but wait a long time
 to vaccuum if it's on disk

Waitasec, in memory?

There are two projects here:

1) Make vacuum, when it's freezing tuples, freeze every tuple > lesser
age if it finds any tuples which are > max_age (or I suppose if the
page is already dirty due to vacuum or something else). Vacuum still
has to read in all the pages before it finds out that they don't need
freezing, so it doesn't mean distinguishing "in memory" from "needs to
be read in".

2) Have something like bgwriter check if the page is dirty and vacuum
and freeze things based on the lesser threshold. This would
effectively only be vacuuming things that are in memory

However the latter is a more complex and fraught project. We looked at
this a while back in EDB and we found that the benefits were less than
we expected and the complexities more than we expected.  I would
recommend sticking with (1) for now and only looking at (2) if we have
a more detailed plan and solid testable use cases.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Greg Stark
On Fri, Aug 14, 2009 at 12:21 AM, Tom Lanet...@sss.pgh.pa.us wrote:
 I was envisioning, if the page is already dirty and in memory *for any
 reason*, the freeze rows at below some threshold.

 I believe we've had this discussion before.  I do *NOT* want freezing
 operations pushed into any random page access, and in particular will
 do my best to veto any attempt to put them into the bgwriter.

It's possible Josh accidentally waved this red flag and really meant
just to make it conditional on whether the page is dirty rather than
on whether vacuum dirtied it.

However he did give me a thought

With the visibility map vacuum currently only covers pages that are
known to have in-doubt tuples. That's why we have the anti-wraparound
vacuums. However it could also check if the pages it's skipping are in
memory and process them if they are even if they don't have in-doubt
tuples.

Or it could first go through ram and process any pages that are in
cache before going to the visibility map and starting from page 0,
which would hopefully avoid having to read them in later when we get
to them and find they've been flushed out.

I'm just brainstorming here. I'm not sure if either of these are
actually worth the complexity and danger of finding new bottlenecks in
special case optimization codepaths.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW-based dblink

2009-08-13 Thread Itagaki Takahiro

Alvaro Herrera alvhe...@commandprompt.com wrote:

  int64   exec(self, query);/* for UPDATE, INSERT, DELETE 
  */
 
 It's not good to return int64 in exec(), because it could have a
 RETURNING clause.  (So it also needs a fetchsize).

We should use open() for a RETURNING query.
It is just the same as the present dblink_exec(), which only returns a command tag.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Hitoshi Harada
2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 I prefered using CTE, because this way was the most short to small
 bugs less prototype - with full functionality.

 You could make it by query rewriting, but as you say the best cleanest
 way is total refactoring of existing nodeAgg. How easy to implement is
 not convincing.


 I agree. Simply I am not have time and force do it. I would to
 concentrate on finishing some plpgsql issues, and then I have to do
 some other things than PostgreSQL. There are fully functional
 prototype and everybody is welcome to continue in this work.


I see your situation. Actually your prototype is in good shape to be
discussed either way. But since you've been focusing on this feature,
it'd be better if you keep your eyes on it.

So, Oleg, do you continue on this?


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Олег Царев
2009/8/14 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 I prefered using CTE, because this way was the most short to small
 bugs less prototype - with full functionality.

 You could make it by query rewriting, but as you say the best cleanest
 way is total refactoring of existing nodeAgg. How easy to implement is
 not convincing.


 I agree. Simply I am not have time and force do it. I would to
 concentrate on finishing some plpgsql issues, and then I have to do
 some other things than PostgreSQL. There are fully functional
 prototype and everybody is welcome to continue in this work.


 I see your situation. Actually your prototype is good shape to be
 discussed in both ways. But since you've been focusing on this feature
 it'd be better if you keep your eyes on this.

 So, Oleg, do you continue on this?


 Regards,


 --
 Hitoshi Harada


 I'd imagine such like:

 select a, b, count(*) from x group by rollup(a, b);

 PerGroup all = init_agg(), a = init_agg(), ab = init_agg();
 while(row = fetch()){
   if(group_is_changed(ab, row)){
     result_ab = finalize_agg(ab);
     ab = init_agg();
   }
   if(group_is_changed(a, row)){
     result_a = finalize_agg(a);
     a = init_agg();
   }
   advance_agg(all, row);
   advance_agg(a, row);
   advance_agg(ab, row);
 }
 result_all = finalize_agg(all);
Fun =) My implementation of ROLLUP in the DBMS qd works just as you imagine there! =)
Also, we take multiple sorts of the source for the CUBE implementation, but
this is hard to support (sorting inside GROUP BY - it's bloat).
As a result we have a merge implementation of GROUP BY, ROLLUP, and window
functions with some common code - it's a way of grouping the source.
A hash implementation does the grouping on different hash tables (with
different keys); it's very expensive (it requires a lot of memory for the keys).
I hope to continue my work after the end of the time troubles at my job =( (bad
TPC-H performance)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW-based dblink

2009-08-13 Thread Itagaki Takahiro

Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:

 Quite aside from the requirement for on-commit trigger, how exactly
 would you use 2PC with the remote database? When would you issue PREPARE
 TRANSACTION, and when would COMMIT PREPARED? What if the local database
 crashes in between - is the remote transaction left hanging in prepared
 state?

I'm thinking of preparing remote transactions just before committing the local
transaction in CommitTransaction(). The pseudo code is something like:

1. Fire deferred triggers and do the just-before-commit work.
2. AtEOXact_dblink()
   = prepare and commit remote transactions.
3. HOLD_INTERRUPTS()
   We cannot roll back the local transaction after this.
4. Do the remaining work for commit.

If we need more robust atomicity, we could use 2PC against the local
transaction if there are some remote transactions, i.e., expand the COMMIT
command into PREPARE TRANSACTION and COMMIT PREPARED internally:

1. Fire deferred triggers and do works for just-before-commit.
2. AtEOXact_dblink_prepare()-- prepare remotes
3. PrepareTransaction() -- prepare local
4. AtEOXact_dblink_commit() -- commit remotes
5. FinishPreparedTransaction(commit)-- commit local
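
On the remote side, the statements involved are just the standard
two-phase commands; a sketch (the GID string is made up):

PREPARE TRANSACTION 'dblink_xact_42';   -- on the remote connection
-- ... the local transaction commits (or is itself prepared) ...
COMMIT PREPARED 'dblink_xact_42';
-- or, if the local commit never happens:
ROLLBACK PREPARED 'dblink_xact_42';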

I'm using a deferrable after trigger for this purpose in my present
prototype, and it seems to work as long as the trigger is called at the
end of the deferrable events and the local backend doesn't crash in the
final commit work -- and we already have some must-not-fail operations
in that final work (flushing WAL, etc.).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Custom geometry, why slow?

2009-08-13 Thread Paul Matthews
The story so far ... The provided polygon@point routine does not work
correctly when the points are close to the boundary. So we implemented a
custom contains(poly,point) function. In order to stop all points being
checked against all polygons, a separate bounding box is maintained. So
the query has sections looking like :

   boundbox @> box( thepoint, thepoint ) AND
   contains(boundary,thepoint)

You will notice that each point to be checked has to be promoted to a
degenerate box. Working on the assumption that there is a cost
associated with this (ie pmalloc), and we will be passing 100's of
millions of points through this in a single transaction, streaming this
is important. At any rate it looked kludgy. The goal is to provide:

   boundbox @> thepoint AND
   contains(boundary,thepoint)

So the whole family of point op box functions was provided (except
for point <@ box, which already exists). The operators have been created,
and the operators added to the box_ops operator family. Samples below:

  CREATE OR REPLACE FUNCTION leftof(box,point) RETURNS boolean
  LANGUAGE C IMMUTABLE STRICT
  AS 'contains.so', 'box_point_leftof';
  ..etc...

  DROP OPERATOR IF EXISTS << (box,point);
  CREATE OPERATOR << (
    LEFTARG    = box,
    RIGHTARG   = point,
    PROCEDURE  = leftof,
    RESTRICT   = positionsel,
    JOIN       = positionjoinsel
  );
  ...etc...

  ALTER OPERATOR FAMILY box_ops USING GiST ADD
    OPERATOR 1  <<  (box,point),
    OPERATOR 2  &<  (box,point),
    OPERATOR 3  &&  (box,point),
    OPERATOR 4  &>  (box,point),
    OPERATOR 5  >>  (box,point),
    OPERATOR 7  @>  (box,point),
  --OPERATOR 8  <@  (point,box),
    OPERATOR 9  &<| (box,point),
    OPERATOR 10 <<| (box,point),
    OPERATOR 11 |>> (box,point),
    OPERATOR 12 |&> (box,point);

The problem is, according to EXPLAIN, it still wants to do a sequential
scan and not use the index. Any pointers as to why?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CommitFest 2009-07: Remaining Patches

2009-08-13 Thread Robert Haas
OK, we have the following patches remaining for CommitFest 2009-07...

(1) Named and mixed notation for PL.  Tom left this one as Waiting on
Author because he didn't have much else left to work on, just in case
Pavel could rework it in time.  I'll move it to Returned with Feedback
after tomorrow if it's not resubmitted before then.
(2) ECPG dynamic cursor, SQLDA support.  I think we're still waiting
on Michael Meskes to review this one.
(3) query cancel issues in dblink.  Joe Conway is planning to review
this, but not until this weekend.
(4) plpythonu datatype conversion improvements.  Peter Eisentraut said
that was in progress as of 7/24, and I've seen a few PL/python related
commits go by, I think, but not this one, so I guess this one is still
waiting on Peter.
(5, 6) dependencies for generated header files, autogenerating headers
and bki stuff - Tom doesn't seem to think any of this is moving in the
right direction, I believe Alvaro likes it, and I think Peter is
skeptical as well but I'm not totally sure.  It might be nice to have
a bit more discussion to figure out what WOULD be a good way to move
forward with this project.  What is good?  What is bad?  What is ugly?
 But the discussion seems to have trailed off so maybe I should just
move these to Rejected and give up.
(7) Prefix support for synonym dictionary - Just waiting on Oleg to
commit this one, I think.
(8) Filtering dictionary support and unaccent dictionary - Needs some
changes to the docs to handle high-bit characters, but other than that
I think all objections that have been raised have been addressed,
so I think this is also ready to commit once the doc issues are
addressed.

Comments welcome.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Wisconsin benchmark

2009-08-13 Thread Jeff Janes
The Wisconsin Benchmark in src/test/bench is broken, probably since 8.2.

Attached is a tested patch that fixes it.  However, it might be better
to just remove src/test/bench.  The benchmark is quite useless,
because it is a single-user test that runs in stand-alone mode, and
because it is laughably small, taking just a couple of seconds on a
six-year-old, not-all-that-good machine.  And it doesn't seem to be all
that faithful to the Wisconsin Benchmark, as the string fields are only
6 characters rather than the 42 (I think) of the benchmark.


Cheers,

Jeff


WISC-patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of GROUPING SETS (T431: Extended grouping capabilities)

2009-08-13 Thread Pavel Stehule
2009/8/14 Олег Царев zabiva...@gmail.com:
 2009/8/14 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 2009/8/13 Hitoshi Harada umi.tan...@gmail.com:
 2009/8/14 Pavel Stehule pavel.steh...@gmail.com:
 I preferred using CTE, because that was the shortest way to a
 prototype with fewer bugs and full functionality.

 You could do it by query rewriting, but as you say the cleanest way
 is a total refactoring of the existing nodeAgg. How easy that would be
 to implement is not clear, though.


 I agree. I simply don't have the time and energy to do it. I would like
 to concentrate on finishing some plpgsql issues, and then I have to do
 some things other than PostgreSQL. There is a fully functional
 prototype and everybody is welcome to continue this work.


 I see your situation. Actually, your prototype is in good shape to be
 discussed either way. But since you've been focusing on this feature,
 it'd be better if you keep your eyes on it.

 So, Oleg, will you continue with this?


 Regards,


 --
 Hitoshi Harada


 I'd imagine something like:

 select a, b, count(*) from x group by rollup(a, b);

 PerGroup all = init_agg(), a = init_agg(), ab = init_agg();
 while(row = fetch()){
  if(group_is_changed(ab, row)){
    result_ab = finalize_agg(ab);
    ab = init_agg();
  }
  if(group_is_changed(a, row)){
    result_a = finalize_agg(a);
    a = init_agg();
  }
  advance_agg(all, row);
  advance_agg(a, row);
  advance_agg(ab, row);
 }
 result_all = finalize_agg(all);
 Fun =) My implementation of rollup in DBMS qd works just as you imagine there! =)
 Also, for the CUBE implementation we take multiple sorts of the source,
 but that is hard to support (a sort inside GROUP BY is bloat).
 As a result we have a merge implementation of GROUP BY, ROLLUP, and
 window functions with some common code - it's a way of grouping the
 source. A hash implementation groups on different hash tables (with
 different keys), which is very expensive (it requires a lot of memory
 for the keys). I hope to continue my work after the time trouble at my
 job ends =( (bad TPC-H performance)


I think you are worrying too much about memory. Look at current
Postgres: any hash grouping is faster than sort grouping. Try and see.
PostgreSQL isn't an embedded database, so using less memory is not the
main goal. The goal is to have features with clean, readable and
maintainable source code.
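
A quick way to see the difference on the example table from upthread
(just a sketch -- the plans depend on the data, and work_mem has to be
large enough for the hash table):

  SET enable_hashagg = on;
  EXPLAIN ANALYZE SELECT a, b, count(*) FROM x GROUP BY a, b;
  -- typically a HashAggregate plan

  SET enable_hashagg = off;
  EXPLAIN ANALYZE SELECT a, b, count(*) FROM x GROUP BY a, b;
  -- forces Sort + GroupAggregate for comparison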

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest 2009-07: Remaining Patches

2009-08-13 Thread Pavel Stehule
2009/8/14 Robert Haas robertmh...@gmail.com:
 OK, we have the following patches remaining for CommitFest 2009-07...

 (1) Named and mixed notation for PL.  Tom left this one as Waiting on
 Author because he didn't have much else left to work on, just in case
 Pavel could rework it in time.  I'll move it to Returned with Feedback
 after tomorrow if it's not resubmitted before then.

I'll work on it tonight. I would like to address some of Tom's comments.

Pavel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers