Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Richard Huxton

Josh Berkus wrote:

Mark,


Even though they run on the same machine, run the same version of the
software, and are used by the same applications, they have NO
interoperability. For now, let's just accept that they need to be on
separate physical clusters because some need to be started and
stopped while others must remain running. There are other reasons,
but one will suffice for the discussion.


For an immediate solution to what you are encountering, have you looked at 
pgPool?


I agree with Josh - pgpool sounds like the place to start with this.

That's got to be the easiest place to add some sort of list-all/switch-to-db 
functionality. It also means you're not *forced* to have only one 
version of PG, or to have them all on the same machine.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] look up tables while parsing queries

2006-02-03 Thread andrew
Hi

I am modifying the source code. I want to look up some information
from some tables while parsing queries. What functions can I use
to look up tables? BTW, I am using version 7.3. Thanks.

--
andrew



[HACKERS] latin1 unicode conversion errors

2006-02-03 Thread Kris Jurka


Why is latin1 special in its handling of unconvertible Unicode data? 
Other Latin character sets emit a warning, but latin1 errors out.


jurka=# create database utf8 with encoding ='utf8';
CREATE DATABASE
jurka=# \c utf8
You are now connected to database utf8.
utf8=# create table t(a text);
CREATE TABLE
utf8=# insert into t values ('\346\231\243');
INSERT 0 1
utf8=# set client_encoding = 'latin2';
SET
utf8=# select * from t;
WARNING:  ignoring unconvertible UTF-8 character 0xe699a3
 a
---

(1 row)

utf8=# set client_encoding = 'latin1';
SET
utf8=# select * from t;
ERROR:  could not convert UTF8 character 0x00e6 to ISO8859-1
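In Python terms (an analogy only, not the server's conversion code path), the two behaviors contrast a lossy substitution against a hard failure:

```python
# The three bytes inserted above are one CJK character in UTF-8.
ch = b"\xe6\x99\xa3".decode("utf-8")

# latin2-style behavior: substitute the unconvertible character and
# carry on (the server emits a WARNING and returns a degraded row).
print(ch.encode("iso8859-2", errors="replace"))  # b'?'

# latin1-style behavior: refuse outright with a hard error.
try:
    ch.encode("iso8859-1")
except UnicodeEncodeError as e:
    print("error:", e.reason)
```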

Kris Jurka



Re: [HACKERS] latin1 unicode conversion errors

2006-02-03 Thread Bruce Momjian

My guess is that it was coded by someone different and needs to be made
consistent.

---

Kris Jurka wrote:
 
 Why is latin1 special in its conversion from unconvertible unicode data? 
 Other latin character sets add a warning, but latin1 errors out.
 
 jurka=# create database utf8 with encoding ='utf8';
 CREATE DATABASE
 jurka=# \c utf8
 You are now connected to database utf8.
 utf8=# create table t(a text);
 CREATE TABLE
 utf8=# insert into t values ('\346\231\243');
 INSERT 0 1
 utf8=# set client_encoding = 'latin2';
 SET
 utf8=# select * from t;
 WARNING:  ignoring unconvertible UTF-8 character 0xe699a3
   a
 ---
 
 (1 row)
 
 utf8=# set client_encoding = 'latin1';
 SET
 utf8=# select * from t;
 ERROR:  could not convert UTF8 character 0x00e6 to ISO8859-1
 
 Kris Jurka
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



[HACKERS] Need Help In arrays

2006-02-03 Thread Salman Razzaq
Dear Group Members,
Hi,
I found it difficult to use 'construct_md_array', as I want to construct an
array with NULL elements included in it. How do I do that?

   Thank you.


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Mark Woodward
 Mark Woodward schrieb:
 ...
 Unless you can tell me how to insert live data and indexes into a cluster
 without having to reload the data and recreate the indexes, then I hardly
 think I am misinformed. The ad hominem attack wasn't necessary.

 I see you had a use case for something like pg_diff and pg_patch ;)
 ...
 If no one sees a way to manage multiple physical database clusters as
 one logical cluster as something worth doing, then so be it. I have a
 practical example of a valid reason why this would make PostgreSQL
 easier to work with. Yes, there are workarounds. Yes, it is not currently
 unworkable.

 I don't see your problem, really ;)

 1) if you have very big and heavily loaded databases, you often have
 them on different physical boxes anyway
 2) you can run any number of postmasters on the same box - just set
 them to listen on different ip:port combinations.

 Now to the management - you say cddb and geodb are managed off host.
 So they are not managed on the live server, and so you don't need to
 switch your psql console to them.

 And yeah, it's really not a problem to quit psql and connect
 to a different server anyway :-)

 If you don't like to type -p otherport, you can either create
 aliases with all the arguments or use something like pgadmin3,
 which enables you to easily switch from database to database,
 from host to host as you like.

 Now is there any use case I have missed which you still would
 like to have addressed?

I don't, as it happens, have these databases on different machines, but
come to think about it, maybe it doesn't matter.

The port aspect is troubling; it isn't really self-documenting. The
application isn't psql; the applications are custom code written in PHP
and C/C++.

Like I said in this thread of posts, yes, there are ways of doing this,
and I've been doing it for years. It is just one of the rough edges that I
think could be smoother.

(in php)
pg_connect("dbname=geo host=dbserver");

could connect to and query dbserver; if the db is not on it, it could
consult a database of known servers, find geo, and use that information
to connect. It sounds like a simple thing, for sure, but to be useful,
there needs to be buy-in from the group; otherwise it is just some
esoteric hack.
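A minimal sketch of the lookup being proposed (all names and ports here are hypothetical; in the proposal the directory would itself live in a well-known PostgreSQL database rather than in application code):

```python
# Hypothetical directory of logical database name -> (host, port).
# In the proposal this mapping would live in a known PostgreSQL
# database; a dict stands in for it here.
DIRECTORY = {
    "geo":  ("dbserver", 5433),
    "cddb": ("dbserver", 5434),
}

def resolve(dbname, default=("dbserver", 5432)):
    """Map a logical database name to the (host, port) that serves it,
    falling back to the default cluster when the name is unknown."""
    return DIRECTORY.get(dbname, default)

host, port = resolve("geo")
print(host, port)  # dbserver 5433
```

The application then connects with the resolved host and port instead of hard-coding them, which is the "one logical server" effect being asked for.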

The point is, I have been working with this sort of use case for a
number of years, and being able to represent multiple physical databases
as one logical db server would make life easier. It was a brainstorm I had
while I was setting up this sort of system for the [n]th time.

For my part, I have tried to maintain my own change list for PostgreSQL in
the past, but it is a pain. The main source changes too frequently to keep
up and in the end is just another project to maintain.

Using the /etc/hosts file or DNS to maintain host locations is a
fairly common and well-known practice, but there is no such mechanism for
ports. The problem then becomes a code issue, not a system administration
issue.

If one writes the code for their website to use a generic host name, say
dbserver, then one can easily test system changes locally and push the
code to a live site. The only difference is the host name. When a port is
involved, there is no systemic way to represent it to the operating
system, so it must be part of the code. As part of the code, it must
reside in a place where code has access, and must NOT be pushed with the
rest of the site.

Having some mechanism to deal with this would be cleaner IMHO.



Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Doug McNaught
Mark Woodward [EMAIL PROTECTED] writes:

 The point is, I have been working with this sort of use case for a
 number of years, and being able to represent multiple physical databases
 as one logical db server would make life easier. It was a brainstorm I had
 while I was setting up this sort of system for the [n]th time.

It sounds like all that would be needed is a kind of smart
proxy: it has a list of database clusters on the machine and the
databases they contain, and speaks enough of the protocol to recognize
the startup packet and reroute it internally to the right cluster.
I've heard 'pgpool' mentioned here; from a quick look at the docs it
looks similar but not quite what you want.

So your databases would listen on 5433, 5434, etc and the proxy would
listen on 5432 and route everything properly.  If a particular cluster
is not up, the proxy could just error out the connection.
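The routing step Doug describes hinges on decoding the client's startup packet before choosing a back end. A rough sketch of that decode (this assumes the v3 wire format spoken by 7.4-and-later clients; 7.3-era clients use the older v2 format, and the helper names here are made up):

```python
import struct

PROTO_V3 = 0x00030000  # v3 wire protocol version number

def build_startup_packet(params):
    """Build a v3 startup packet: int32 total length, int32 protocol
    version, then NUL-terminated key/value pairs, closed by one NUL."""
    body = b"".join(k.encode() + b"\x00" + v.encode() + b"\x00"
                    for k, v in params.items()) + b"\x00"
    return struct.pack("!ii", 8 + len(body), PROTO_V3) + body

def parse_startup_packet(data):
    """Return the startup parameters (user, database, ...) as a dict."""
    length, proto = struct.unpack_from("!ii", data, 0)
    if proto != PROTO_V3:
        raise ValueError("not a v3 startup packet")
    params = {}
    fields = iter(data[8:length].split(b"\x00"))
    for key in fields:
        if not key:                      # empty key ends the parameter list
            break
        params[key.decode()] = next(fields).decode()
    return params

# The proxy would read this packet off the client socket, look at
# params["database"], and relay the connection to whichever cluster
# (5433, 5434, ...) serves that database.
pkt = build_startup_packet({"user": "mark", "database": "geo"})
print(parse_startup_packet(pkt)["database"])  # geo
```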

Hmm, that'd be fun to write if I ever find the time...

-Doug



Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 02:09, Tino Wildenhain wrote:


Well, if the view does not suit your needs, why don't you use a
set-returning function instead? Inside it you can do all the magic
you want and still use it like a table or view.


That's what I'm currently doing (as explained in the first message in  
the thread). But the function is a black box to the planner when  
the query is executed -- I'd like the planner to be able to combine  
the query inside the function with the outer calling query and plan  
it as one big query. Like it does with views. Thus, views with  
arguments.


We're certainly not deficient in this area (set-returning functions  
fill the need quite well), but a feature like this would go even  
further in terms of ease-of-use and performance.


Benefits of views with arguments versus functions:

* Better query execution performance because the planner can plan the  
whole query (rewriting the original query to replace references to  
the view with the view's definition -- this is how views work today)


* PostgreSQL-tracked dependencies: views create dependencies on the  
relations they reference -- functions do not


* Don't have to manually maintain a composite type for the return value

Basically, better performance and easier administration.

Thanks!

- Chris




Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Martijn van Oosterhout
On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
 Using the /etc/hosts file or DNS to maintain host locations is a
 fairly common and well known practice, but there is no such mechanism for
 ports. The problem then becomes a code issue, not a system administration
 issue.

Actually, there is, it's in /etc/services and the functions are
getservbyname and getservbyport. I wonder if it'd be possible to have
psql use this if you put a string in the port part of the connect
string.
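For illustration, the table getservbyname consults is simple enough to parse by hand (the geodb/cddb entries below are invented for this thread; real code would just call the C function, or socket.getservbyname in Python):

```python
def parse_services(text):
    """Parse /etc/services-format text into {(name, protocol): port}.

    Each non-comment line reads: name  port/protocol  [aliases...].
    getservbyname(3) consults this same table, so a symbolic port
    string in a connect string could be mapped through it.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        fields = line.split()
        port, proto = fields[1].split("/")
        for name in [fields[0]] + fields[2:]:  # official name + aliases
            table[(name, proto)] = int(port)
    return table

sample = """
# name        port/protocol  aliases
postgresql    5432/tcp       postgres
geodb         5433/tcp
cddb          5434/tcp
"""

ports = parse_services(sample)
print(ports[("geodb", "tcp")])  # 5433
```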

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Martijn van Oosterhout
On Fri, Feb 03, 2006 at 08:33:23AM -0500, Chris Campbell wrote:
 On Feb 3, 2006, at 02:09, Tino Wildenhain wrote:
 
  Well, if the view does not suit your needs, why don't you use a
  set-returning function instead? Inside it you can do all the magic
  you want and still use it like a table or view.
 
 That's what I'm currently doing (as explained in the first message in  
 the thread). But the function is a black box to the planner when  
 the query is executed -- I'd like the planner to be able to combine  
 the query inside the function with the outer calling query and plan  
 it as one big query. Like it does with views. Thus, views with  
 arguments.
 
 We're certainly not deficient in this area (set-returning functions  
 fill the need quite well), but a feature like this would go even  
 further in terms of ease-of-use and performance.

Hmm, we actually do inline SQL functions under certain situations, but
only for simple queries (see inline_function in
optimizer/util/clauses.c). One approach would be to expand that
function to inline more complicated things.

 Benefits of views with arguments versus functions:
 
 * Better query execution performance because the planner can plan the  
 whole query (rewriting the original query to replace references to  
 the view with the view's definition -- this is how views work today)

Well, the inlining would achieve the same effect.

 * PostgreSQL-tracked dependencies: views create dependencies on the  
 relations they reference -- functions do not

Technically a bug. We should be tracking dependencies for functions
anyway.

 * Don't have to manually maintain a composite type for the return value

This is a good point. Though with syntactic sugar you could work around
this too...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-02-03 Thread Bruce Momjian

Where are we on this patch?

I am thinking it should be in the most logical place, rather than in
pg_ctl.  One call isn't a big deal, especially if you can run the
postmaster without using pg_ctl.

---

Magnus Hagander wrote:
   If the job object code is moved to the postmaster, it'll
   work when not running as a service as well.
  
  I'd just as soon keep all that Windows-specific cruft in pg_ctl.
  So I think the way you've got it set up is fine.
 
 Well, it'd all be localised to the backend/port/win32 directory of
 course, except for a single call - where there is already Windows
 cruft.
 
 But I see your point. And the vast majority of production
 installations run as a service anyway. So I won't spend any time making
 those changes; I'll leave what's on -patches now.
 
 //Magnus
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 08:50, Martijn van Oosterhout wrote:


Hmm, we actually do inline SQL functions under certain situations, but
only for simple queries (see inline_function in
optimizer/util/clauses.c). One approach would be to expand that
function to inline more complicated things.



* Better query execution performance because the planner can plan the
whole query (rewriting the original query to replace references to
the view with the view's definition -- this is how views work today)


Well, the inlining would acheive the same effect.


So you think approaching it from the beefing up functions side  
would be better than the beefing up views side?



* PostgreSQL-tracked dependencies: views create dependencies on the
relations they reference -- functions do not


Technically a bug. We should be tracking dependencies for functions
anyway.


Well, it's impossible to track dependencies for all functions, since
they're just strings (or compiled code in shared libraries) until
they're executed. But maybe SQL-language functions could be
special-cased? Do you think it would be easier to add dependency tracking
for functions, or would it be easier to implement this functionality
using the more-restrictive-language but better-dependency-tracking
view system? When you add dependencies for things that didn't have
dependencies before (like non-SQL functions), you create all sorts of
backwards-compatibility problems due to the order in which things need
to be dumped and created, and circular dependencies.


For example, this works:

CREATE FUNCTION foo(INTEGER) RETURNS INTEGER AS 'BEGIN RETURN bar($1-1); END;' LANGUAGE plpgsql;

CREATE FUNCTION bar(INTEGER) RETURNS INTEGER AS 'BEGIN IF $1 < 0 THEN RETURN $1; ELSE RETURN foo($1); END IF; END;' LANGUAGE plpgsql;


But it wouldn't work if PostgreSQL tracked and enforced dependencies.
It could probably be done with SQL-language functions only. I
don't know if we'd want to add dependency tracking for functions if
it only works for SQL-language functions, though.


This is a good point. Though with syntactic sugar you could work around
this too...

Basically, how views do it? :) By auto-creating a table with the
proper columns (for a function, that would be an auto-created type).


I'm looking for a function/view hybrid, taking features from each. It  
seems to me that views have most of the features I want (only missing  
the ability to pass arguments), so it's a shorter distance to the  
goal than by starting with functions.


Thanks!

- Chris




Re: [HACKERS] Need Help In arrays

2006-02-03 Thread Alvaro Herrera
Salman Razzaq wrote:

 I found it difficult to use 'construct_md_array', as I want to
 construct an array with NULL elements included in it. How do I do that?

Keep in mind that you can't do that in released versions (8.1 and
earlier); it's only possible in the current development branch.  If you
are using 8.2dev, see the code comment -- it's explained there.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Mark Woodward
 Mark Woodward [EMAIL PROTECTED] writes:

 The point is, I have been working with this sort of use case for a
 number of years, and being able to represent multiple physical databases
 as one logical db server would make life easier. It was a brainstorm I had
 while I was setting up this sort of system for the [n]th time.

 It sounds like all that would be needed is a kind of smart
 proxy--has a list of database clusters on the machine and the
 databases they contain, and speaks enough of the protocol to recognize
 the startup packet and reroute it internally to the right cluster.
 I've heard 'pgpool' mentioned here; from a quick look at the docs it
 looks similar but not quite what you want.

 So your databases would listen on 5433, 5434, etc and the proxy would
 listen on 5432 and route everything properly.  If a particular cluster
 is not up, the proxy could just error out the connection.

 Hmm, that'd be fun to write if I ever find the time...

It is similar to a proxy, yes, but that is just part of it. The setup and
running of these systems should all be managed.



Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Peter Eisentraut
Mark Woodward wrote:
 Oh come on, misinformed? Is that really called for?

Claiming that all databases share the same system tables is misinformed, 
with no judgement passed.

 The street database is typically generated and QAed in the lab. It is
 then uploaded to the server. It has many millions of rows and about a
 half dozen indexes. To dump and reload takes almost a day.

There is work happening on speeding up bulk loads.

 Unless you can tell me how to insert live data and indexes to a
 cluster without having to reload the data and recreate the indexes,

I think this sort of thing can be worked on.  VACUUM FREEZE and some 
tool support could make this happen.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Martijn van Oosterhout
On Fri, Feb 03, 2006 at 09:18:51AM -0500, Chris Campbell wrote:
 So you think approaching it from the beefing up functions side  
 would be better than the beefing up views side?

Well yes, I think you're underestimating the issues with trying to
extend views.

 Technically a bug. We should be tracking dependencies for functions
 anyway.
 
 Well, it's impossible to track dependencies for all functions, since  
 they're just strings (or compiled code in shared libraries) until  

snip

Sure, for most procedural languages you can't do much. But we do do
syntax checking already, and checking that the necessary functions exist
can be considered part of that. It's not terribly important though.

 Basically, how views do it? :) By auto-creating a table with the  
 proper columns (for a function, that would be an auto-created type).
 
 I'm looking for a function/view hybrid, taking features from each. It  
 seems to me that views have most of the features I want (only missing  
 the ability to pass arguments), so it's a shorter distance to the  
 goal than by starting with functions.

Ok, here's the deal. A view is nothing more than a RULE. Creating a
view does this automatically:

CREATE RULE blah AS ON SELECT TO myview DO INSTEAD <select statement>;

Now, say you wanted to add parameters to this, would you restrict it to
SELECT rules, what about UPDATE or DELETE rules?

UPDATE myview(3,4) SET ...;

The other issue is that currently you can tell from looking at a
statement whether something is a function or a table (is it followed by
an open parenthesis?). With the above change you can't anymore, which
might mean you can't have functions and tables with the same names
because they'd be ambiguous.

On the whole, I think allowing the server to inline SRFs would be a far
better way to go...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Doug McNaught
Mark Woodward [EMAIL PROTECTED] writes:

 It is similar to a proxy, yes, but that is just part of it. The setup and
 running of these systems should all be managed.

All that requires is some scripts that wrap pg_ctl and bring the right
instances up and down, perhaps with a web interface on top of them.  I
don't see any need to put that functionality in the proxy.

-Doug



Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-02-03 Thread Magnus Hagander
You'll still need to run the postmaster from pg_ctl to get the run-as-admin
part. The only part that could be moved is the Job Object for
management, and you're normally not going to need that one when you're
not running as a service. Maybe sometimes, but I doubt it's worth it. If
we could solve the run-as-admin issue inside the postmaster it might be
different, but AFAICS we can't.

//Magnus
 

 -Original Message-
 From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
 
 
 Where are we on this patch?
 
 I am thinking it should be in the most logical place, rather 
 than in pg_ctl.  One call isn't a big deal, especially if you 
 can run the postmaster without using pg_ctl.
 
 ---
 
 Magnus Hagander wrote:
    If the job object code is moved to the postmaster, it'll
    work when not running as a service as well.
   
   I'd just as soon keep all that Windows-specific cruft in pg_ctl.
   So I think the way you've got it set up is fine.
  
  Well, it'd all be localised to the backend/port/win32 directory of
  course, except for a single call - where there is already Windows
  cruft.
  
  But I see your point. And the vast majority of production
  installations run as a service anyway. So I won't spend any
  time making those changes; I'll leave what's on -patches now.
  
  //Magnus
  
  
 
 -- 
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, 
 Pennsylvania 19073
 



Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-02-03 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 You'll still need to run the postmaster from pg_ctl to get the run-as-admin
 part. The only part that could be moved is the Job Object for
 management, and you're normally not going to need that one when you're
 not running as a service. Maybe sometimes, but I doubt it's worth it. If
 we could solve the run-as-admin issue inside the postmaster it might be
 different, but AFAICS we can't.

Yeah, I thought we'd agreed that the patch was fine as-is.

regards, tom lane



Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 10:25, Martijn van Oosterhout wrote:


Sure, for most procedural languages you can't do much. But we do do
syntax checking already, and checking that the necessary functions exist
can be considered part of that. It's not terribly important though.


Dependency tracking needs to be more than a best effort. If you
can't do it accurately and completely, then I don't think it's worth
doing at all.


But I agree with you that syntax checking is probably sufficient. We
don't need true dependency tracking for functions.


The only failure case (where syntax checking won't help you) is
deleting (or renaming, or modifying) a table that a function was
using. If you were to run or re-define the function, you'd learn
about the missing (or renamed, or modified) table, whereas the
dependency-tracking system would prevent you from making the changes
to the referenced table in the first place.



Ok, here's the deal. A view is nothing more than a RULE. Creating a
view does this automatically


Technically, it's a table and a rule, both of which are created by  
the CREATE VIEW command. We were talking about syntactic sugar, and  
CREATE VIEW is syntactic sugar for doing a CREATE TABLE and a CREATE  
RULE. That was my comparison. I'm aware of how views work. Here's the  
deal: I want to beef up rules versus beefing up functions. Maybe  
that's not the way to go; I'm enjoying this discussion and your  
insights.



CREATE RULE blah AS ON SELECT TO myview DO INSTEAD <select statement>;

Now, say you wanted to add parameters to this, would you restrict it to
SELECT rules? What about UPDATE or DELETE rules?


I don't see a huge use case for anything but SELECT rules, but I  
think it could be extended to any rule type. Maybe the CREATE RULE  
syntax could be something like:


CREATE RULE blah AS ON SELECT(INTEGER, INTEGER, DATE) TO myview  
DO INSTEAD SELECT * FROM sale WHERE sale_date = $3;



The other issue is that currently you can tell from looking at a
statement whether something is a function or a table (is it followed by
an open parenthesis?). With the above change you can't anymore, which
might mean you can't have functions and tables with the same names
because they'd be ambiguous.


Right. I said that my example syntax was deficient in this regard in
the first message in this thread, and I solicited ideas for a better
(unambiguous) syntax. I'm sure we would be able to come up with
something. Maybe using square brackets instead of parentheses? Curly
braces? myview->(1, 2, 3) notation? Since views are tables (when
parsing the query, at least) we'd have to allow this syntax for any
table reference, but throw an error (or silently discard the
arguments) if the table didn't have a rule matching the argument types?


On the whole, I think allowing the server to inline SRFs would be a far
better way to go...


Maybe, but the highly-structured view definition syntax and
everything that comes with it (dependency tracking primarily) is so
tasty. I think a little grammar hacking and a couple extra columns in
pg_rewrite (nargs and argtypes) would get us most of the way there.


I would much rather put more stringent requirements on the programmer  
when defining his query (like a view), versus letting him submit any  
old string as a function (like a function). The database can do so  
much more when it's able to work with a better representation of the  
computation.


At the core, I want query rewriting with arguments. That sounds like  
a better fit for views/rules than functions, so that's why I keep  
coming back to it.


Thanks!

- Chris




Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Tom Lane
Chris Campbell [EMAIL PROTECTED] writes:
 I want to beef up rules versus beefing up functions.

Martijn didn't present a very convincing argument why this is a bad
idea, but I agree with him that it is.  The problem to me is that a
view with arguments is fundamentally wrong --- a view is a virtual
table and there is no such thing as a table with arguments.  The whole
notion distorts the relational data model beyond recognition.

The SRF concept captures what you want a whole lot better.  If the
implementation isn't up to snuff, we should improve it, not warp other
pieces of the system.

Martijn mentioned the idea of inlining SQL functions that return sets
--- this is something I've toyed with too, but not got round to looking
at seriously.  AFAICS it would accomplish everything that you could do
with parameters in ON SELECT rules, considering the existing
restrictions on what can be in an ON SELECT rule.  And it wouldn't
require any new concepts at all, just a few(?) pages of code.

As for the dependency issue, one man's bug is another man's feature.
I think the fact that we don't track the internal dependencies of
functions is not all bad.  We've certainly seen plenty of complaints
about how you can't easily change tables that a view is depending on
because the view dependencies block it...

regards, tom lane



Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Chris Browne
[EMAIL PROTECTED] (Mark Woodward) writes:
 The port aspect is troubling; it isn't really self-documenting.
 The application isn't psql, the applications are custom
 code written in PHP and C/C++.

Nonsense.  See /etc/services

 Using the /etc/hosts file or DNS to maintain host locations is
 a fairly common and well known practice, but there is no such
 mechanism for ports. The problem then becomes a code issue, not a
 system administration issue.

Nonsense.  See /etc/services

 If one writes the code for their website to use a generic host name,
 say, dbserver, then one can easily test system changes locally and
 push the code to a live site. The only difference is the host
 name. When a port is involved, there is no systemic way to represent
 that to the operating system, and it must therefore be part of the
 code. As part of the code, it must reside in a place where code has
 access, and must NOT be pushed with the rest of the site.

 Having some mechanism to deal with this would be cleaner IMHO.

I'm sure it would be, that's why there has been one, which has been in
use since the issuance of RFC 349 by Jon Postel back in May of 1972.
The mechanism is nearly 34 years old.

Note that RFCs are no longer used to issue port listings, as per RFC
3232, back in 2002.  Now, IANA manages a repository of standard port
numbers, commonly populated into /etc/services.

  http://www.iana.org/assignments/port-numbers

For customizations, see:

% man 5 services
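
For illustration, the lookup Chris is pointing at can be sketched in a few
lines of shell; the pg_live/pg_qa service names and ports are made up, and a
local file stands in for /etc/services:

```shell
#!/bin/sh
# Stand-in for /etc/services; the pg_live/pg_qa entries are hypothetical
# examples of giving each cluster's port a stable, documented name.
cat > services.sample <<'EOF'
pg_live    5433/tcp   # production cluster
pg_qa      5434/tcp   # QA cluster
EOF

# Resolve a service name to its port number, the way getservbyname(3)
# would against the real /etc/services.
lookup_port() {
    awk -v svc="$1" '$1 == svc { split($2, a, "/"); print a[1]; exit }' services.sample
}

lookup_port pg_live   # prints 5433
lookup_port pg_qa     # prints 5434
```

Application code can then ask for the pg_live port by name instead of
hard-coding 5433, which keeps the mapping a system administration issue.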
-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://www.ntlug.org/~cbbrowne/sgml.html
Motto for a research laboratory: What we work on today, others will
first think of tomorrow. -- Alan J. Perlis

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Tom Lane
Chris Campbell [EMAIL PROTECTED] writes:
 True, as long as there's a hook to do the inlining/rewriting before  
 the query's planned. I guess we can see function calls at the parse  
 stage, check to see if they're SQL functions or not, grab the prosrc,  
 do the substitution, then re-parse?

pull_up_subqueries in prepjointree.c would be the appropriate place
I think: if it's an RTE_FUNCTION RTE, look to see if function is SQL
and has the other needed properties, if so replace it by an RTE_SUBQUERY
RTE with the correct subquery, then recurse to try to flatten the
subquery.  (Note: I'm in the middle of hacking that code to flatten
UNION subqueries, so you might want to wait till I commit before
starting on a patch ;-))

 I guess I can live without the dependency tracking. I can always dump  
 and reload my database to re-parse all the functions. Maybe we could  
 have a RELOAD FUNCTION command that would just re-parse an existing  
 function, so I don't have to dump and reload?

Hm?  I don't understand why you think this is needed.

 What about auto-creating a composite type for the function's return  
 type based on the query definition?

Can't get excited about this --- although I don't have any fundamental
objection either.  Choosing a name for such a type might be a bit of
a problem (I don't think you can just use the function name, as that
would preclude overloading).

 Maybe an extension to CREATE FUNCTION as a shorthand for
 set-returning SQL functions?

It would be surprising if this form of CREATE FUNCTION defaulted to
assuming SETOF when other forms don't, so I don't like the proposal
as written.  Also the syntax you suggest has no place to put function
attributes like VOLATILE.

Note that you can already do

regression=# create function fooey(int, out k1 int, out k2 int) returns setof 
record as
regression-# $$ select unique1, unique2 from tenk1 where thousand = $1 $$ 
language sql;
CREATE FUNCTION
regression=# select * from fooey(44);
  k1  |  k2
------+------
 7044 |  562
 5044 |  692
 1044 |  789
 4044 | 1875
 3044 | 3649
 2044 | 4063
 8044 | 6124
 6044 | 6451
 9044 | 6503
   44 | 7059
(10 rows)

regression=#

I'm not convinced that the incremental advantage of not having to write
out the function output column types is worth introducing an
inconsistent variant of CREATE FUNCTION.  Some indeed would argue that
that's not an advantage at all ;-)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Rick Gigger


On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:


On Feb 3, 2006, at 08:05, Mark Woodward wrote:

Using the /etc/hosts file or DNS to maintain host locations is a
fairly common and well known practice, but there is no such
mechanism for ports. The problem now becomes a code issue, not a
system administration issue.


What if you assigned multiple IPs to a machine, then used ipfw (or  
something) to forward connections to port 5432 for each IP to the  
proper IP and port?


If he had multiple IPs, couldn't he just make them all listen only on  
one specific IP (instead of '*') and just use the default port?
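
A sketch of that arrangement for a post-8.0 server (the addresses are
examples; pre-8.0 used different settings such as virtual_host):

```
# Cluster A's postgresql.conf
listen_addresses = '192.168.1.10'   # example address
port = 5432

# Cluster B's postgresql.conf -- same machine, second IP
listen_addresses = '192.168.1.11'   # example address
port = 5432
```

Clients then select a cluster purely by hostname, with every cluster on the
default port.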


---(end of broadcast)---


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 12:43, Rick Gigger wrote:

If he had multiple IPs, couldn't he just make them all listen only  
on one specific IP (instead of '*') and just use the default port?


Yeah, but the main idea here is that you could use ipfw to forward  
connections *to other hosts* if you wanted to. Basically working like  
a proxy.


- Chris


---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 12:27, Tom Lane wrote:


I guess I can live without the dependency tracking. I can always dump
and reload my database to re-parse all the functions. Maybe we could
have a RELOAD FUNCTION command that would just re-parse an existing
function, so I don't have to dump and reload?


Hm?  I don't understand why you think this is needed.


Consider function foo() that references table bar. When you CREATE  
FUNCTION foo() ... AS 'SELECT ... FROM bar' you get an error message  
if bar doesn't exist. If it does exist, CREATE FUNCTION succeeds.


If you later DROP bar, you're not informed that function foo() was  
referencing it. You only find that out if you redefine foo() (using  
CREATE OR REPLACE FUNCTION and passing in the same definition, which  
fails) or if you try to run foo() (and the query fails).


If functions had true dependency tracking, then you couldn't DROP bar  
due to foo()'s dependency on it, unless you did a DROP CASCADE and  
were alerted that foo() was dropped as well.


I'm fine with those limitations. I can confirm that all of my  
functions are not referencing tables that don't exist by doing a  
CREATE OR REPLACE FUNCTION to reload each function. A pg_dump/pg_restore  
would accomplish this, but it would be nice to have a  
RELOAD FUNCTION (or REPARSE? or VERIFY?) command that would  
just re-parse the function's source code (like CREATE FUNCTION does)  
and spit out errors if the function is referencing relations that  
don't exist. Just as a way to confirm that the table modification I  
just performed didn't break any functions. On-demand dependency  
checking, in a way.



Note that you can already do

regression=# create function fooey(int, out k1 int, out k2 int)  
returns setof record as
regression-# $$ select unique1, unique2 from tenk1 where thousand =  
$1 $$ language sql;

CREATE FUNCTION
regression=# select * from fooey(44);
  k1  |  k2
------+------
 7044 |  562
 5044 |  692
 1044 |  789
 4044 | 1875
 3044 | 3649
 2044 | 4063
 8044 | 6124
 6044 | 6451
 9044 | 6503
   44 | 7059
(10 rows)

regression=#


Learn something new every day. I'm still using 7.4 for most of my day  
job, and I can't do this without supplying a column definition list:


ERROR:  a column definition list is required for functions returning  
record


I hereby withdraw my proposal for CREATE SQL FUNCTION.

Thanks!

- Chris



---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Tom Lane
Chris Campbell [EMAIL PROTECTED] writes:
 I'm fine with those limitations. I can confirm that all of my  
 functions are not referencing tables that don't exist by doing a  
 CREATE OR REPLACE FUNCTION to reload each function. A pg_dump/pg_restore  
 would accomplish this, but it would be nice to have a  
 RELOAD FUNCTION (or REPARSE? or VERIFY?) command that would  
 just re-parse the function's source code (like CREATE FUNCTION does)  
 and spit out errors if the function is referencing relations that  
 don't exist.

This is putting way too much trust in the behavior of a
PL-language-specific verifier function.  Anyway, you can do what you
want today:
select fmgr_sql_validator(oid) from pg_proc where prolang = 14;
(Generalizing this to work for any language is left as an exercise
for the reader...)

regards, tom lane

---(end of broadcast)---


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Josh Berkus
Mark, all:

  So your databases would listen on 5433, 5434, etc and the proxy would
  listen on 5432 and route everything properly.  If a particular cluster
  is not up, the proxy could just error out the connection.
 
  Hmm, that'd be fun to write if I ever find the time...

 It is similar to a proxy, yes, but that is just part of it. The setup
 and running of these systems should all be managed.

Per my earlier comment, this really seems like an obvious extension of 
pgPool, or Sequoia if you're a Java geek.  No need to re-invent the wheel.

In terms of the PostgreSQL Core, though, Mark, it sounds like you're 
treating the symptoms and not the causes.   What you really need is a way 
to load a large database very quickly (in binary form or otherwise) 
without downing the cluster.  This is a generally desired feature that has 
been discussed several times on this list, and you could get general 
agreement on easily.

The feature you proposed is a way to make your idiosyncratic setup easier 
to manage, but doesn't apply to anyone else's problems on this list, so 
you're going to have a hard time drumming up enthusiasm.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Josh Berkus
Tom,

 As for the dependency issue, one man's bug is another man's feature.
 I think the fact that we don't track the internal dependencies of
 functions is not all bad.  We've certainly seen plenty of complaints
 about how you can't easily change tables that a view is depending on
 because the view dependencies block it...

I'd agree with this.   I write about 150,000 lines of function code a year, 
and if I had to rebuild all of the cascading functions every time I change 
a table the way I have to with views, it would probably add 20% to my 
overall application development time.

BTW, the other thing that we're still TODOing on SRFs (as far as I know) is 
finding ways to change the row estimate for an SRF.  It's still a flat 
1000 in the code, which can cause a lot of bad query plans.  I proposed a 
year ago that, as a first step, we allow the function owner to assign a 
static estimate variable to the function (i.e. 'average rows returned = 
5').  This doesn't solve the whole problem of SRF estimates but it would 
be a significant step forwards in being able to use them in queries.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-02-03 Thread Bruce Momjian
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  You'll still need to run the postmaster from pg_ctl to get the run as
  admin part. The only part that could be moved is the Job Object for
  management. And you're normally not going to need that one when you're
  not running as a service. Maybe sometimes, but I doubt it's worth it. If
  we could solve the run-as-admin inside the postmaster it might be
  different, but AFAICS we can't.
 
 Yeah, I thought we'd agreed that the patch was fine as-is.

Fine. I just wanted to make sure the decision was being made in terms of
logic, rather than Win32 cruft avoidance.  The previous discussion was
not clear on this point.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Mark Dilger

Tom Lane wrote:

Chris Campbell [EMAIL PROTECTED] writes:

True, as long as there's a hook to do the inlining/rewriting before  
the query's planned. I guess we can see function calls at the parse  
stage, check to see if they're SQL functions or not, grab the prosrc,  
do the substitution, then re-parse?



pull_up_subqueries in prepjointree.c would be the appropriate place
I think: if it's an RTE_FUNCTION RTE, look to see if function is SQL
and has the other needed properties, if so replace it by an RTE_SUBQUERY
RTE with the correct subquery, then recurse to try to flatten the
subquery.  (Note: I'm in the middle of hacking that code to flatten
UNION subqueries, so you might want to wait till I commit before
starting on a patch ;-))


If we are talking about inserting the function definition into the query as a 
subquery and then letting the parser treat it as a subquery, then I see no 
reason to use either the existing function or view subsystems.  It sounds more 
like we are discussing a macro language.


  CREATE MACRO foo(bar,baz) AS $$
select a from b where b > bar and b < baz
  $$;

Then when you query

  SELECT * FROM foo(1,7) AS f WHERE f % 7 = 3

you get a macro expansion as such:

  SELECT * FROM (select a from b where b > bar and b < baz) AS f WHERE f % 7 = 3

Then whatever optimizations the query planner can manage against a subquery will 
work for macros as well.


Thoughts?


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 BTW, the other thing that we're still TODOing on SRFs (as far as I know) is 
 finding ways to change the row estimate for an SRF.  It's still a flat 
 1000 in the code, which can cause a lot of bad query plans.  I proposed a 
 year ago that, as a first step, we allow the function owner to assign a 
 static estimate variable to the function (i.e. 'average rows returned = 
 5').  This doesn't solve the whole problem of SRF estimates but it would 
 be a significant step forwards in being able to use them in queries.

The inlining thing would solve that much better, at least for the cases
where the function can be inlined.  I'm not sure how we can improve the
situation for things like looping plpgsql functions --- the function
owner probably can't write down a hard estimate for those either, in
most cases.

regards, tom lane

---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Tom Lane
Mark Dilger [EMAIL PROTECTED] writes:
 If we are talking about inserting the function definition into the
 query as a subquery and then letting the parser treat it as a
 subquery, then I see no reason to use either the existing function or
 view subsystems.  It sounds more like we are discussing a macro
 language.

Which is pretty much what a SQL function is already.  I don't see a need
to invent a separate concept.  To the extent that macros have different
semantics than functions (eg, multiple evaluation of arguments) the
differences are generally not improvements IMHO ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Mark Dilger

Josh Berkus wrote:

Tom,



As for the dependency issue, one man's bug is another man's feature.
I think the fact that we don't track the internal dependencies of
functions is not all bad.  We've certainly seen plenty of complaints
about how you can't easily change tables that a view is depending on
because the view dependencies block it...



I'd agree with this.   I write about 150,000 lines of function code a year, 
and if I had to rebuild all of the cascading functions every time I change 
a table the way I have to with views, it would probably add 20% to my 
overall application development time.


BTW, the other thing that we're still TODOing on SRFs (as far as I know) is 
finding ways to change the row estimate for an SRF.  It's still a flat 
1000 in the code, which can cause a lot of bad query plans.  I proposed a 
year ago that, as a first step, we allow the function owner to assign a 
static estimate variable to the function (i.e. 'average rows returned = 
5').  This doesn't solve the whole problem of SRF estimates but it would 
be a significant step forwards in being able to use them in queries.




This would only seem to work for trivial functions.  Most functions that I write 
are themselves dependent on underlying tables, and without any idea how many 
rows are in the tables, and without any idea of the statistical distribution of 
those rows, I can't really say anything like 'average rows returned = 5'.


What I have wanted for some time is a function pairing system.  For each set 
returning function F() I create, I would have the option of creating a 
statistics function S() which returns a single integer which represents the 
guess of how many rows will be returned.  S() would be called by the planner, 
and the return value of S() would be used to decide the plan.  S() would need 
access to the table statistics information.  I imagine that the system would 
want to prevent S() from running queries, and only allow it to call certain 
defined table statistics functions and some internal math functions, thereby 
avoiding any infinite recursion in the planner.  (If S() ran any queries, those 
queries would go yet again to the planner, and on down the infinite recursion 
you might go.)


Of course, some (possibly most) people could choose not to write an S() for their 
F(), and the default of 1000 rows would continue to be used.  As such, this new 
extension to the system would be backwards compatible to functions which don't 
have an S() defined.


Thoughts?

---(end of broadcast)---


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Josh Berkus
Mark,

 This would only seem to work for trivial functions.  Most functions that
 I write are themselves dependent on underlying tables, and without any
 idea how many rows are in the tables, and without any idea of the
 statistical distribution of those rows, I can't really say anything like
 'average rows returned = 5'.

 What I have wanted for some time is a function pairing system.  For each
 set returning function F() I create, I would have the option of creating
 a statistics function S() which returns a single integer which
 represents the guess of how many rows will be returned.  S() would be
 called by the planner, and the return value of S() would be used to
 decide the plan.  S() would need access to the table statistics
 information.  I imagine that the system would want to prevent S() from
 running queries, and only allow it to call certain defined table
 statistics functions and some internal math functions, thereby avoiding
 any infinite recursion in the planner.  (If S() ran any queries, those
 queries would go yet again to the planner, and on down the infinite
 recursion you might go.)

 Of course, some (possibly most) people could choose not to write an S()
 for their F(), and the default of 1000 rows would continue to be used. 
 As such, this new extension to the system would be backwards compatible
 to functions which don't have an S() defined.

I think this is a fine idea, and I think I endorsed it the first time.  
However, even a static "function returns #" would be better than what we 
have now, and I think the S() method could take quite a bit of engineering 
to work out (for example, what if F() is being called in a JOIN or 
correlated subquery?).  So I'm worried that shooting for the S() idea only 
could result in us not doing *anything* for several more versions.

What I'd like to do is implement the constant method for 8.2, and work on 
doing the S() method later on.  Does that make sense?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Jeremy Drake
On Fri, 3 Feb 2006, Josh Berkus wrote:

 The feature you proposed is a way to make your idiosyncratic setup easier
 to manage, but doesn't apply to anyone else's problems on this list, so
 you're going to have a hard time drumming up enthusiasm.

I am somewhat reluctant to interject into this discussion, but the
particular idiosyncratic setup referred to is not the only one where
this may be useful.

The immediate use I thought of was being able to have what appeared to be
multiple databases on the same server with different locale settings,
which cannot be changed post-initdb.  I could see having different
databases in different locales being a useful feature, perhaps in a
Wikipedia-type setup so that the English, Chinese, and Arabic wikis could
each provide the correct sort order and other locale-specific properties
while still providing a single logical database server for connection
strings.  This just being the first example I could think of in which such
a setup could be useful.
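
A sketch of that arrangement (paths, ports, and locale names are
illustrative, not meant to be run verbatim; available locales vary by
platform): one cluster per locale, each started and stopped independently:

```
initdb -D /srv/pg/en --locale=en_US.UTF-8
initdb -D /srv/pg/zh --locale=zh_CN.UTF-8
pg_ctl -D /srv/pg/en -o '-p 5433' start
pg_ctl -D /srv/pg/zh -o '-p 5434' start
```

Each cluster keeps its own sort order, while a name-to-port mapping or a
proxy presents them as one logical server.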

-- 
In the beginning, I was made.  I didn't ask to be made.  No one consulted
with me or considered my feelings in this matter.  But if it brought some
passing fancy to some lowly humans as they haphazardly pranced their way
through life's mournful jungle, then so be it.
- Marvin the Paranoid Android, from Douglas Adams' Hitchhiker's Guide to the
Galaxy Radio Scripts

---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Mark Dilger

Tom Lane wrote:

Mark Dilger [EMAIL PROTECTED] writes:


If we are talking about inserting the function definition into the
query as a subquery and then letting the parser treat it as a
subquery, then I see no reason to use either the existing function or
view subsystems.  It sounds more like we are discussing a macro
language.



Which is pretty much what a SQL function is already.  I don't see a need
to invent a separate concept.  To the extent that macros have different
semantics than functions (eg, multiple evaluation of arguments) the
differences are generally not improvements IMHO ...

regards, tom lane


I have numerous times run EXPLAIN ANALYZE on my queries with SQL functions 
embedded and gotten different (far worse) results than if I manually inline the 
function following the macro expansion idea above.  That has led me to wish that 
postgres would inline it for me.  That doesn't prove that the macro idea is 
needed; it might be that the SQL function system needs more work.  (In fact, I 
haven't done this since 8.0.3, so I'm not sure that 8.1 even does a bad job 
anymore.)


---(end of broadcast)---


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Mark Dilger

Josh Berkus wrote:

Mark,



This would only seem to work for trivial functions.  Most functions that
I write are themselves dependent on underlying tables, and without any
idea how many rows are in the tables, and without any idea of the
statistical distribution of those rows, I can't really say anything like
'average rows returned = 5'.

What I have wanted for some time is a function pairing system.  For each
set returning function F() I create, I would have the option of creating
a statistics function S() which returns a single integer which
represents the guess of how many rows will be returned.  S() would be
called by the planner, and the return value of S() would be used to
decide the plan.  S() would need access to the table statistics
information.  I imagine that the system would want to prevent S() from
running queries, and only allow it to call certain defined table
statistics functions and some internal math functions, thereby avoiding
any infinite recursion in the planner.  (If S() ran any queries, those
queries would go yet again to the planner, and on down the infinite
recursion you might go.)

Of course, some (possibly most) people could choose not to write an S()
for their F(), and the default of 1000 rows would continue to be used. 
As such, this new extension to the system would be backwards compatible

to functions which don't have an S() defined.



I think this is a fine idea, and I think I endorsed it the first time.  
 However, even a static "function returns #" would be better than what we 
have now, and I think the S() method could take quite a bit of engineering 
to work out (for example, what if F() is being called in a JOIN or 
correlated subquery?).  So I'm worried that shooting for the S() idea only 
could result in us not doing *anything* for several more versions.


What I'd like to do is implement the constant method for 8.2, and work on 
doing the S() method later on.  Does that make sense?




I have no objections to implementing the constant method sooner than the full 
version.  It might be useful to implement it as a subsyntax to the full version 
though in preparation for later expansion.  For instance, if there is a plstats 
language developed, you could limit the parser for it to just functions like:


CREATE FUNCTION S() RETURNS INTEGER PROVIDES FOR F() AS $$
RETURN 5;
$$ LANGUAGE PLSTATS;

Then the language could be expanded later to allow calls to the table statistics 
functions.


---(end of broadcast)---


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Josh Berkus
Jeremy,

 The immediate use I thought of was being able to have what appeared to
 be multiple databases on the same server with different locale settings,
 which cannot be changed post-initdb. 

Again, this is patching the symptoms instead of going after the cause.  The 
real issue you're trying to address is not being able to set locale per 
database, which is what we really want.

Not that symptomatic cures are out of the question for add-ons, like pgPool 
(and I could see a lot of uses for a pgPool that could obscure the fact 
that it was connecting to multiple servers).  But they aren't the way to 
go for the core code.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---


Re: [HACKERS] Error working with Temporary Sequences in plpgsql in 8.1 (8.0

2006-02-03 Thread Daniel Schuchardt

A nice workaround, because

EXECUTE 'select nextval(''test'')' INTO I;


doesn't work in 8.0, seems to be:

myid:=nextval('stvtrsid_seq'::TEXT);


This seems to work in every case.


Daniel


Jaime Casanova schrieb:

try this way:

CREATE OR REPLACE FUNCTION testseq() RETURNS void AS
$BODY$
BEGIN
 EXECUTE 'CREATE TEMP SEQUENCE test';
 PERFORM testseq1();
 DROP SEQUENCE test;
 RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION testseq() OWNER TO postgres;


CREATE OR REPLACE FUNCTION testseq1() RETURNS void AS
$BODY$
DECLARE I INTEGER;
BEGIN
 EXECUTE 'select nextval(''test'')' INTO I;
raise notice '%', I;
 RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION testseq1() OWNER TO postgres;

SELECT testseq();
SELECT testseq();


It's the same problem as with temp tables: you must put their creation,
and in this case even the nextval, in an EXECUTE...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 What I'd like to do is implement the constant method for 8.2, and work on 
 doing the S() method later on.  Does that make sense?

I'm not thrilled with putting in a stopgap that we will have to support
forever.  The constant method is *clearly* inadequate for many (probably
most IMHO) practical cases.  Where do you see it being of use?

W.R.T. the estimator function method, the concern about recursion seems
misplaced.  Such an estimator presumably wouldn't invoke the associated
function itself.  I'm more concerned about coming up with a usable API
for such things.  Our existing mechanisms for estimating operator
selectivities require access to internal planner data structures, which
makes it pretty much impossible to write them in anything but C.  We'd
need something cleaner to have a feature I'd want to export for general
use.

regards, tom lane

---(end of broadcast)---


Re: [HACKERS] Passing arguments to views

2006-02-03 Thread Chris Campbell

On Feb 3, 2006, at 11:21, Tom Lane wrote:


The SRF concept captures what you want a whole lot better.  If the
implementation isn't up to snuff, we should improve it, not warp other
pieces of the system.


Point taken. The rewriting concept is what I'm after; if that can be  
done pre-planning with SQL functions, I'm all for it. I just thought  
that since rules already do rewriting, that's the best thing to start  
building on.



Martijn mentioned the idea of inlining SQL functions that return sets
--- this is something I've toyed with too, but not got round to looking
at seriously.  AFAICS it would accomplish everything that you could do
with parameters in ON SELECT rules, considering the existing
restrictions on what can be in an ON SELECT rule.  And it wouldn't
require any new concepts at all, just a few(?) pages of code.


True, as long as there's a hook to do the inlining/rewriting before  
the query's planned. I guess we can see function calls at the parse  
stage, check to see if they're SQL functions or not, grab the prosrc,  
do the substitution, then re-parse?


I guess I can live without the dependency tracking. I can always dump  
and reload my database to re-parse all the functions. Maybe we could  
have a RELOAD FUNCTION command that would just re-parse an existing  
function, so I don't have to dump and reload?


What about auto-creating a composite type for the function's return  
type based on the query definition? (Like how CREATE VIEW creates an  
appropriate table definition.) Do you see a way for CREATE FUNCTION  
to do that? The problem is that you have to specify a return type in  
CREATE FUNCTION.


Maybe an extension to CREATE FUNCTION as a shorthand for
set-returning SQL functions? Like:


   CREATE SQL FUNCTION sales_figures(DATE) AS SELECT ... FROM ...  
WHERE sale_date = $1;


It would (1) automatically create a composite type (newtype) for the  
return value, and (2) do a


   CREATE FUNCTION sales_figures(DATE) RETURNS SETOF newtype AS  
'...' LANGUAGE sql.


How much do I have to justify a patch for non-standard RELOAD  
FUNCTION and CREATE SQL FUNCTION commands (as described) in the  
grammar? :)


Thanks!

- Chris


---(end of broadcast)---


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Josh Berkus
Tom,

  What I'd like to do is implement the constant method for 8.2, and work
  on doing the S() method later on.  Does that make sense?

 I'm not thrilled with putting in a stopgap that we will have to support
 forever.  The constant method is *clearly* inadequate for many (probably
 most IMHO) practical cases.  Where do you see it being of use?

Well, mostly for the real-world use cases where I've run into SRF estimate 
issues, which have mostly been SRFs which return one row.

 W.R.T. the estimator function method, the concern about recursion seems
 misplaced.  Such an estimator presumably wouldn't invoke the associated
 function itself.  

No, but if you're calling the S() estimator in the context of performing a 
join, what do you supply for parameters?

 I'm more concerned about coming up with a usable API 
 for such things.  Our existing mechanisms for estimating operator
 selectivities require access to internal planner data structures, which
 makes it pretty much impossible to write them in anything but C.  We'd
 need something cleaner to have a feature I'd want to export for general
 use.

Yes -- we need to support the simplest case, which is functions that return 
either (a) a fixed number of rows, or (b) a fixed multiple of the number 
of rows passed to the function.  These simple cases should be easy to 
build.  For more complex estimation, I personally don't see a problem with 
forcing people to hack it in C.
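For intuition, the two "simple cases" above could be sketched roughly as follows (Python, purely illustrative — the names RowEstimator and estimate are made up for this sketch, not a proposed API; the 1000-row fallback mirrors the planner's default guess for set-returning functions):

```python
# Illustrative sketch of the two simple estimator cases described above:
# (a) a fixed row count, and (b) a fixed multiple of the input row count.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowEstimator:
    constant: Optional[float] = None    # case (a): SRF always returns N rows
    multiplier: Optional[float] = None  # case (b): N output rows per input row

    def estimate(self, input_rows: float) -> float:
        if self.constant is not None:
            return self.constant
        if self.multiplier is not None:
            return self.multiplier * input_rows
        return 1000.0  # default guess when nothing is declared

one_row = RowEstimator(constant=1)   # e.g. an SRF known to return one row
fanout = RowEstimator(multiplier=3)  # e.g. three output rows per input row
```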

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 I'm not thrilled with putting in a stopgap that we will have to support
 forever.  The constant method is *clearly* inadequate for many (probably
 most IMHO) practical cases.  Where do you see it being of use?

 Well, mostly for the real-world use cases where I've run into SRF estimate 
 issues, which have mostly been SRFs which return one row.

Well, if they're certain to return one row, you can just declare them as
not SETOF, no?  Since 8.1 we do get that case right:

regression=# explain select * from cos(0);
                       QUERY PLAN
---------------------------------------------------------
 Function Scan on cos  (cost=0.00..0.01 rows=1 width=8)
(1 row)


 No, but if you're calling the S() estimator in the context of performing a 
 join, what do you supply for parameters?

Exactly my point about the API problem.  I'm not sure that joins matter,
but the function parameters sure do, and those might not be simple constants.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Mark Woodward
 On Feb 3, 2006, at 12:43, Rick Gigger wrote:

 If he had multiple ips couldn't he just make them all listen only
 on one specific ip (instead of '*') and just use the default port?

 Yeah, but the main idea here is that you could use ipfw to forward
 connections *to other hosts* if you wanted to. Basically working like
 a proxy.

I certainly hope that is not the idea; ipfw would add a whole level of
complexity best avoided.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Mark Woodward

 On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:

 On Feb 3, 2006, at 08:05, Mark Woodward wrote:

 Using the /etc/hosts file or DNS to maintain host locations is a
 fairly common and well known practice, but there is no such
 mechanism for ports. The problem now becomes a code issue, not a
 system administration issue.

 What if you assigned multiple IPs to a machine, then used ipfw (or
 something) to forward connections to port 5432 for each IP to the
 proper IP and port?

 If he had multiple ips couldn't he just make them all listen only on
 one specific ip (instead of '*') and just use the default port?

That is a good idea, and yes it would work, as do a lot of other
scenarios, but shouldn't PostgreSQL take care of PostgreSQL?



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Mark Dilger

Josh Berkus wrote:

Tom,



What I'd like to do is implement the constant method for 8.2, and work
on doing the S() method later on.  Does that make sense?


I'm not thrilled with putting in a stopgap that we will have to support
forever.  The constant method is *clearly* inadequate for many (probably
most IMHO) practical cases.  Where do you see it being of use?



Well, mostly for the real-world use cases where I've run into SRF estimate 
issues, which have mostly been SRFs which return one row.




W.R.T. the estimator function method, the concern about recursion seems
misplaced.  Such an estimator presumably wouldn't invoke the associated
function itself.  



No, but if you're calling the S() estimator in the context of performing a 
join, what do you supply for parameters?


I've been thinking about this more, and now I don't see why this is an issue. 
When the planner estimates how many rows will be returned from a subquery 
used within a join, it can't know which parameters to use either. (By 
"parameters" I mean whatever conditions the subquery pivots on that come from 
some other part of the full query's execution.)  So it seems to me that 
function S() is at no more of a disadvantage than the planner.


If I defined a function S(a integer, b integer) which provides an estimate for 
the function F(a integer, b integer), then S(null, null) could be called when 
the planner can't know what a and b are.  S could then still make use of the 
table statistics to provide some sort of estimate.  Of course, this would mean 
that functions S() cannot be defined strict.


I'm more concerned about coming up with a usable API 
for such things.  Our existing mechanisms for estimating operator

selectivities require access to internal planner data structures, which
makes it pretty much impossible to write them in anything but C.  We'd
need something cleaner to have a feature I'd want to export for general
use.



Yes -- we need to support the simplest case, which is functions that return 
either (a) a fixed number of rows, or (b) a fixed multiple of the number 
of rows passed to the function.  These simple cases should be easy to 
build.  For more complex estimation, I personally don't see a problem with 
forcing people to hack it in C.


Could we provide table statistics access functions in whatever higher-level 
language S() is written in, or is there something fundamentally squirrelly about 
the statistics that would make this impossible?


Also, since we haven't nailed down a language for S(), if we allowed any of sql, 
plpgsql, plperl, plpython, etc, then we would need access methods for each, 
which would place a burden on all PLs, right?  That argument isn't strong enough 
to make me lean either way; it's just an observation.




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Function Stats WAS: Passing arguments to views

2006-02-03 Thread Mark Dilger

Mark Dilger wrote:
I've been thinking about this more, and now I don't see why this is an 
issue. When the planner estimates how many rows will be returned from a 
subquery used within a join, it can't know which parameters to use 
either. (By "parameters" I mean whatever conditions the subquery pivots 
on that come from some other part of the full query's execution.)  So it 
seems to me that function S() is at no more of a disadvantage than the 
planner.


If I defined a function S(a integer, b integer) which provides an 
estimate for the function F(a integer, b integer), then S(null, null) 
could be called when the planner can't know what a and b are.  S could 
then still make use of the table statistics to provide some sort of 
estimate.  Of course, this would mean that functions S() cannot be 
defined strict.


Ok, null probably isn't a good value.  F(null, null) could be the call being 
made, so S(null, null) would mean "F is being passed nulls" rather than "we 
don't know what F's arguments are yet".  The returned estimate might be quite 
different for these two cases.  You could have:


  F(a integer, b integer)
  S(a integer, a_is_known boolean, b integer, b_is_known boolean)

But I'm not fond of the verbosity of doubling the argument list.  Since some 
arguments might be known while others still are not, I don't think a single 
boolean argument "all_arguments_are_known" is sufficient.
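The value-plus-flag idea can be sketched in a few lines (Python, purely illustrative — the names Arg, UNKNOWN, and estimate_F are made up for this sketch, and the returned numbers are arbitrary stand-ins for real statistics lookups):

```python
# Each estimator argument carries a (value, is_known) pair, so that
# "F will be passed NULL" and "we don't know F's argument yet" are
# distinguishable cases, unlike a bare NULL sentinel.
from typing import Any, NamedTuple, Optional

class Arg(NamedTuple):
    value: Optional[Any]
    is_known: bool

UNKNOWN = Arg(None, False)  # the planner can't know this argument yet

def estimate_F(a: Arg, b: Arg) -> float:
    # A real S() would consult table statistics; these returns just
    # show how the two meanings of NULL diverge.
    if not a.is_known or not b.is_known:
        return 1000.0       # generic fallback guess
    if a.value is None or b.value is None:
        return 0.0          # say F(NULL, ...) is known to return no rows
    return 10.0             # a value-specific estimate

estimate_F(UNKNOWN, UNKNOWN)               # planner has no values yet
estimate_F(Arg(None, True), Arg(5, True))  # F really is passed a NULL
```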


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Multiple logical databases

2006-02-03 Thread Bruce Momjian
Mark Woodward wrote:
 
  On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:
 
  On Feb 3, 2006, at 08:05, Mark Woodward wrote:
 
  Using the /etc/hosts file or DNS to maintain host locations is a
  fairly common and well known practice, but there is no such
  mechanism for ports. The problem now becomes a code issue, not a
  system administration issue.
 
  What if you assigned multiple IPs to a machine, then used ipfw (or
  something) to forward connections to port 5432 for each IP to the
  proper IP and port?
 
  If he had multiple ips couldn't he just make them all listen only on
  one specific ip (instead of '*') and just use the default port?
 
 That is a good idea, and yes it would work, as do a lot of other
 scenarios, but shouldn't PostgreSQL take care of PostgreSQL?

PostgreSQL takes care of PostgreSQL only if it is best at doing it ---
in this case, it is not.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PORTS] Failed install - libgen.so doesn't exist

2006-02-03 Thread Peter Eisentraut
Tom Lane wrote:
 should be working towards a project policy that AC_CHECK_LIB calls
 shalt not use "main", but must name some symbol exported by the
 expected library. If we can't find out what symbols the library is
 expected to provide, it's time to dike it out.

I took a first swing at this and rearranged some of these calls.

A couple of other findings:

nsl -- This might contain socket functions on some older platforms.

ipc, IPC, lc -- unknown

ld -- On AIX at least this seems to be some magic library but doesn't 
have an obvious testable symbol.

compat -- could be anything, obviously

bsd, BSD -- I seem to recall that these and possibly compat are used by 
Solaris to switch to BSD semantics on certain functions, but I don't 
think we rely on that (anymore).

gen -- generic or general, so this could be a lot, but it might 
contain syslog on some platforms at least.

PW -- might be a compatibility library but none of the functions I found 
are used by PostgreSQL.

resolv -- I don't think we need that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Copy From Insert UNLESS

2006-02-03 Thread James William Pye
Greets folks, [YABLP: Yet Another Bulk Loading Proposal]

The subject of this letter refers to giving INSERT and COPY FROM STDIN
the ability to alter the destination of rows that violate any constraint named
in a user-specified set.

I am seeking, as many others have before me, to improve bulk-load performance
on live systems where constraint violations may occur and where filtering can
be done more efficiently within the backend.

Primarily, I'm concerned with UNIQUE violations. However, I think tackling the
general case is the wiser action.

The attached patch is *not* intended for actual application, and I
make no claims of it functioning as I intended it to or as I may imply it to. =)
The patch only served to provide rough numbers for the case of
unique violations.

Despite the fact that my experimental patch uses error trapping, that is *not*
what I have in mind for the implementation. I do not want to trap errors upon
insert or copy from. Rather, I wish to implement functionality that allows
alternate destinations for tuples that violate user-specified constraints on
the table; the default destination is to simply drop the tuple.

My proposed syntax is along the lines of:

   INSERT INTO table [ ( column [, ...] ) ]
*   [UNLESS CONSTRAINT VIOLATION
 [ON (constraint_name [, ...]) [THEN INSERT INTO table]] [, OR ...]]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) | query }

 and

   COPY tablename [ ( column [, ...] ) ]
FROM { 'filename' | STDIN }
*   [UNLESS CONSTRAINT VIOLATION
 [ON (constraint_name [, ...]) [THEN INSERT INTO table]] [, OR ...]]
   ...

The purpose of the arguably unintuitive "THEN INSERT INTO x" portion is to provide
the mechanism by which a user can specify the destination table for tuples that
violated the associated set of constraints. Using the "OR" portion allows the user
to specify additional sets of constraints for different destinations.

A tuple will be withheld from the target table if ANY of the constraints
listed in any of the constraint_name sets is violated. Constraint sets should
not [may not?] reference the same constraint multiple times, even among
different sets.

Example:

 \d dest_table
     Table "public.dest_table"
  Column |  Type   | Modifiers
 --------+---------+-----------
  i      | integer | not null
  j      | integer |
 Indexes:
     "dest_table_pkey" PRIMARY KEY, btree (i)
 Check constraints:
     "dest_table_j_check" CHECK (j > 0)

 CREATE TEMP TABLE pkey_failures (i int, j int);
 CREATE TEMP TABLE check_failures (i int, j int);

 COPY dest_table FROM STDIN
  UNLESS CONSTRAINT VIOLATION
   ON (dest_table_pkey) THEN INSERT INTO pkey_failures
   OR (dest_table_j_check) THEN INSERT INTO check_failures;
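The routing semantics of the example can be mimicked in user space for intuition (Python, purely illustrative — route and the stand-in violation checks are made up here; the real checks would of course live inside the backend):

```python
# Each row is checked against the named constraint sets in order; the
# first violated set claims the row, and rows violating no named
# constraint reach the target table.
def route(row, target, constraint_sets):
    """constraint_sets: list of (violates_fn, dest) pairs, where
    violates_fn returns True when the row violates that set."""
    for violates, dest in constraint_sets:
        if violates(row):
            dest.append(row)
            return
    target.append(row)

dest_table, pkey_failures, check_failures = [], [], []
seen_keys = set()

def pkey_violation(row):
    # Stand-in for dest_table_pkey: duplicate i values violate.
    i, _ = row
    if i in seen_keys:
        return True
    seen_keys.add(i)
    return False

def check_violation(row):
    # Stand-in for dest_table_j_check (assumed CHECK (j > 0)).
    _, j = row
    return not (j > 0)

for row in [(1, 5), (1, 6), (2, -1), (3, 2)]:
    route(row, dest_table, [(pkey_violation, pkey_failures),
                            (check_violation, check_failures)])
```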

For most constraints, this proposal should be fairly easy to implement.
However, the B-Tree index has the uniqueness check within its
insert access method, _bt_check_unique. Perhaps the best solution here is to
expose this check function--with changes, of course--and define a new access
method entry, 'amcheck' or, perhaps, 'amscanforinsert' where state information
would be given back to the caller for later use in the actual insert. (Perhaps
evident, but I'm not as familiar with the index code as I would like to be for
this sort of speculation, so please excuse me if I am not making good sense.)

There is one other annoying change. The constraints specified in an INSERT
UNLESS should be checked before all other unspecified constraints. This is to
elegantly handle the insertion case where two violations can occur, one with a
constraint that the user specified, and one that the user didn't. Regardless of
the order in which constraints are collected for checking, the user specified
ones should be checked first to avoid unwelcome errors from being thrown when
the tuple was going to be tossed anyways.

This proposal purposefully does not discuss "bad data" errors, as I think that
should be seen as a separate issue. Perhaps a future feature within the UNLESS
syntax.


Prior Discussions or Mentions [See the last two.]

implicit abort harmful?
http://archives.postgresql.org/pgsql-general/2003-05/msg00962.php
how to continue a transaction after an error?
http://archives.postgresql.org/pgsql-sql/2000-11/msg00097.php
mass import to table with unique index
http://archives.postgresql.org/pgsql-general/2003-01/msg01465.php
Duplicate key insert question
http://archives.postgresql.org/pgsql-general/2003-07/msg00056.php

Ignore when using COPY FROM (Matthew Kennedy) (small, complaint/fr)
http://archives.postgresql.org/pgsql-general/2000-08/msg00681.php
COPY and duplicates (Ryan Mahoney) (small, complaint/fr)
http://archives.postgresql.org/pgsql-general/2001-07/msg00569.php

Bulk loading using COPY - ignore duplicates? (Lee Kindness?)
http://archives.postgresql.org/pgsql-hackers/2002-01/msg00029.php
Practical error logging for very large COPY statements (Simon Riggs)

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-03 Thread Bruce Momjian

Based on this, I think we should just implement the TRUNCATE/DROP option
for the table, and avoid the idea of allowing non-logged operations on a
table that has any data we want recovered after a crash.

---

Simon Riggs wrote:
 On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:
 
  Seems like a nice optimization.
 
 Negative thoughts: Toast tables have a toast index on them, yes? We have
 agreed that we cannot use the optimization if we have indexes on the
 main table. It follows that we cannot use the optimization if we have
 *any* toasted data, since that would require a pointer between two
 blocks, which would not be correctly recovered following a crash. If we
 log the toast table then there could be a mismatch between heap and
 toast table; if we don't log the toast table there could be a mismatch
 between toast table and toast index.
 
 We can test to see if the toast table is empty when we do ALTER TABLE,
 but loading operations may try to create toasted data rows.
 
 Presumably that means we must either:
 i) abort a COPY if we get a toastable value
 ii) if we get a toastable value, insert the row into a new block, which
 we do logging of, then also log the toast insert and the toast index
 insert - i.e. some blocks we log, others not
 
 This is still useful for many applications, IMHO, but the list of
 restrictions seems to be growing. Worse, we wouldn't know that the toast
 tables were empty until after we did the COPY TO for a pg_dump, so we
 wouldn't be able to retrospectively add an ALTER TABLE command ahead of
 the COPY. 
 
 Thoughts? Hopefully there are some flaws in my thinking here,
 
 Best Regards, Simon Riggs
 
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq