Re: [HACKERS] COPY formatting

2004-03-19 Thread Karel Zak
On Thu, Mar 18, 2004 at 10:16:36AM -0500, Tom Lane wrote:
> Passing in a relation OID is probably a bad idea anyway, as it ties this
> API to the assumption that COPY is only for complete relations.  There's
> been talk before of allowing a SELECT result to be presented via the
> COPY protocol, for instance.  What might be a more usable API is
> 
> COPY OUT:
>   function formatter_out(text[]) returns text
> COPY IN:
>   function formatter_in(text) returns text[]
> 
> where the text array is either the results of or the input to the
> per-column datatype I/O routines.  This makes it explicit that the
> formatter's job is solely to determine the column-level wrapping and
> unwrapping of the data.  I'm assuming here that there is no good reason
> for the formatter to care about the specific datatypes involved; can you
> give a counterexample?

 The idea was put maximum information about tuple to formatter, and what
 will formatter do with this information is a formatter problem.

> >  It's pity  that main idea of  current COPY is based  on separated lines
> >  and it is not more common interface for streaming data between FE and BE.
> 
> Yeah, that was another concern I had.  This API would let the formatter
> control line-level layout but it would not eliminate the hard-wired
> significance of newline.  What's worse, there isn't any clean way to
> deal with reading quoted newlines --- the formatter can't really replace
> the default quoting rules if the low-level code is going to decide
> whether a newline is quoted or not.

 I think latest  protocol version works with blocks of  data and no with
 lines and client PQputCopyData() returns a block -- only docs says that
 it is row of table.

> We could possibly solve that by specifying that the text output or input
> (respectively) is the complete line sent to or from the client,
> including newline or whatever other line-level formatting you are using.
> This still leaves the problem of how the low-level COPY IN code knows
> what is a complete line to pass off to the formatter_in routine.  We
> could possibly fix this by adding a second input-control routine
> 
>   function formatter_linelength(text) returns integer
> 
> which is defined to return -1 if the input isn't a complete line yet

 But  formatter_linelength()  will  need   some  context  information  I
 think. The others  words some  struct with formatter  specific internal
 data. And  for more  difficult formats  like XML  you need  some others
 context data (parser data) too.

 Maybe there can be some global  exported struct (like for triggers) and
 functions that is written in C  can use it. It means for simple formats
 like CSV you can  use non-C functions and for formats  like XML you can
 use C functions. And  if it will intereting for PL  developers they can
 add support for access to this structs to their languages.

> (i.e., read some more data, append to the buffer, and try again), or
> >= 0 to indicate that the first N bytes of the buffer represent a
> complete line to be passed off to formatter_in.  I don't see a way to
> combine formatter_in and formatter_linelength into a single function
> without relying on "out" parameters, which would again confine the
> feature to format functions written in C.

> It's a tad annoying that we need two functions for input.  One way that
> we could still keep the COPY option syntax to be just
>   FORMAT csv
> is to create an arbitrary difference in the signatures of the input
> functions.  Then we could have coexisting functions
>   csv(text[]) returns text
>   csv(text) returns text[]
>   csv(text, ...) returns int
> that are referenced by "FORMAT csv".

 It sounds good, but I think we  both not full sure about it now, right?
 CSV support will probably better add by DELIMITER extension.

Karel

-- 
 Karel Zak  <[EMAIL PROTECTED]>
 http://home.zf.jcu.cz/~zakkr/

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] pg_advisor schema proof of concept

2004-03-19 Thread Fabien COELHO

Dear hackers,

please find attached a quick proof of concept for a 'pg_advisor' schema.

Well, the name is 'xpg_advisor' at the time, because it is not a system
schema hence it cannot starts with 'pg_'.

It appears that some support functions would be useful. I've noticed some
are available from pg_catalog, but I have not found yet what I was looking
for.

If you do not like some advices, just "DROP VIEW the_advice;"
If you do not like advices at all, just "DROP SCHEMA xpg_advisor;"

Have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]--
-- $Id: pg_advisor.sql,v 1.3 2004/03/19 10:41:47 coelho Exp $
--
-- pg_advisor maybe future system schema?
--
-- at the time it is xpg_advisor as pg_ is reserved.
--

DROP SCHEMA xpg_advisor CASCADE;

CREATE SCHEMA xpg_advisor;

COMMENT ON SCHEMA xpg_advisor 
IS 'various advices about database design or performance' ;

SET search_path TO xpg_advisor,pg_catalog;

--
-- DESIGN ADVICES (da_*)
--

--
-- tables without primary keys
--

CREATE VIEW da_tables_without_primary_key AS 
SELECT n.nspname AS Schema, c.relname AS Name
FROM pg_class AS c 
  JOIN pg_namespace AS n ON (c.relnamespace=n.oid)
WHERE
  -- no comments about system catalogs.
  n.nspname NOT IN ('pg_catalog','pg_toast','information_schema','xpg_advisor')
  -- only tables
  AND c.relkind = 'r' 
  -- no primary key
  AND NOT c.relhaspkey
ORDER BY Schema ASC, Name ASC;

COMMENT ON VIEW da_tables_without_primary_key
IS 'it is better to have a primary key on your tables';

-- SELECT * FROM da_tables_without_primary_key;

--
-- tables with composite primary keys?
-- 

CREATE VIEW da_tables_with_composite_primary_key AS
SELECT n.nspname AS Schema, c.relname AS Name
FROM pg_class AS c 
  JOIN pg_namespace AS n ON (c.relnamespace=n.oid)
  JOIN pg_constraint AS o ON (o.conrelid=c.oid)
WHERE
  -- no comments about system catalogs.
  n.nspname NOT IN ('pg_catalog','pg_toast','information_schema','xpg_advisor')
  -- only tables
  AND c.relkind = 'r' 
  -- with a primary key
  AND c.relhaspkey
  -- the primary key constraint
  AND contype = 'p'
  -- more than 1 element in conkey
  AND array_upper(o.conkey, 1)>1
ORDER BY Schema ASC, Name ASC;

COMMENT ON VIEW da_tables_with_composite_primary_key
IS 'it may be considered a better design to have simple primary keys';

-- SELECT * FROM da_tables_with_composite_primary_key;

--
-- other ideas for later:
--
-- foreign keys which do not exactly match their target key types?
-- foreign keys which do match their target types but with different sizes?
-- tables with large primary keys (such as TEXT, VARCHAR(64)...)?
--


-- count summary of design advices?
-- how to SELECT COUNT(*) FROM [the view of which I have the oid]?
-- Do I need a function that count tuples in a table given its oid?
-- CREATE FUNCTION pg_count_tuples(INTEGER oid) RETURNS INTEGER...
CREATE VIEW design_advices_summary AS
SELECT c.relname AS Name, d.description AS Description
-- , pg_count_tuples(c.oid) AS Count
FROM pg_class AS c
  JOIN pg_description AS d ON (d.objoid=c.oid)
  JOIN pg_namespace AS n ON (c.relnamespace=n.oid)
WHERE
  -- xpg_advisor views
  n.nspname = 'xpg_advisor'
  -- only design advices
  AND c.relname LIKE 'da_%';

-- SELECT * FROM design_advices_summary;

--
-- PERFORMANCE ADVICES (pa_*)
--
-- no usable primary key index for foreign key referencial integrity checks?
-- no usable index for foreigh key on deletes?
--

-- count summary of performance advices?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] UnixWare/CVS Tip/initdb.c needs to use threads flags...

2004-03-19 Thread Bruce Momjian
Larry Rosenman wrote:
> > I thought that once you include libpthread in libpq, that you don't have
> > to mention it again then you use libpq.  Is your platform different
> > somehow in this regard?
> >
> > I seem to remember this problem with libcrypt and libpq.  Is this the
> > same problem?
> >
> > I see that initdb is just the first of many /bin programs to be
> > compiled, so if we have to add the thread lib, we will have to do it for
> > all the bin programs.  Yikes.  Why wasn't this a problem for 7.4?
> 7.4 had initdb as a Shell Script.
> the 7.4.x libpq didn't have any pthread_* references in it, that I see
> on my box.

Ah, yes.  We added the thread-local storage to handle SIGPIPE.  The
problem is that initdb isn't the only place.  If you comment out initdb
from the Makefile in src/bin, does the next make fail too?  I bet it
does.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-www] [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-19 Thread Chris Ryan
--- David Garamond <[EMAIL PROTECTED]> wrote:
 --snip --
> 
> Also, we're targetting the developers right? Please do not consider 
> ourselves as being too stupid to differentiate between postgresql.org
> 
 -- snip --

IMO this point of view is a short-sighted and narrow one. In
addition to trying to bring a more structured and developed
infrastructure to 3rd party developers this should also ease the
difficulty many non-developers have in finding related software to the
PostgreSQL project. This fact as well as the others should be taken
into consideration when making the decision on which route to go.

Chris Ryan

__
Do you Yahoo!?
Yahoo! Search - Find what you’re looking for faster
http://search.yahoo.com

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] COPY formatting

2004-03-19 Thread Chris Browne
[EMAIL PROTECTED] (Andrew Dunstan) writes:
> Karel Zak wrote:
>
>> Hi,
>>
>> in TODO is item: "* Allow dump/load of CSV format". I don't think
>> it's clean idea. Why CSV and why not something other? :-)
>>
>> A why not allow to users full control of the format by they own
>> function. It means something like:
>> COPY tablename [ ( column [, ...] ) ]
>> TO { 'filename' | STDOUT }
>> [ [ WITH ]  [ BINARY ]
>>  [ OIDS ]
>>  [ DELIMITER [ AS ] 'delimiter' ]
>>  [ NULL [ AS ] 'null string' ]
>>  [ FORMAT funcname ] ]
>>   
>> The formatting
>> function API can be pretty simple:
>>
>> text *my_copy_format(text *attrdata, int direction, int
>> nattrs, int attr, oid attrtype, oid relation)
>>
>> -- it's pseudocode of course, it should be use standard fmgr
>> interface.
>> It's probably interesting for non-binary COPY version.
>
> Interesting ... The alternative might be an external program to munge
> CSVs and whatever other format people want to support and then call
> the exisiting COPY- either in bin or contrib. I have seen lots of
> people wanting to import CSVs, and that's even before we get a Windows
> port.

I know Jan Wieck has been working on something like this, with a bit
of further smarts...

 - By having, alongside, a table definition, the table can be created 
   concurrently;

 - A set of mapping functions can be used, so that if, for instance,
   the program generating the data was Excel, and you have a field with
   values like 37985, 38045, or 38061, they can respectively be mapped
   to '2004-01-01', '2004-03-01', and '2004-03-17';

 - It can load whatever data is loadable, and use Ethernet-like
   backoffs when it encounters bad records so that it loads all the data
   that is good, and leaves a bundle of 'crud' that is left over.

He had been prototyping it in Tcl; I'm not sure how far a port to C
has gotten.  It looked pretty neat; it sure seems better to put the
"cleverness" in userspace than to try to increase the complexity of
the postmaster...
-- 
output = ("cbbrowne" "@" "cbbrowne.com")
http://cbbrowne.com/info/linuxxian.html
Have you heard of the new Macsyma processor?  It has three instructions --
LOAD, STORE, and SKIP IF INTEGRABLE.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] COPY formatting

2004-03-19 Thread Andrew Dunstan


Karel Zak wrote:

The problem with  CSV is that it will correctly  work with new protocol
only. Because old  versions of  clients are newline  sensitive. And CSV
can contains newline in by quotation marks defined attributes:
"John", "Smith", "The White House
1600 Pennsylvania Avenue NW
Washington, DC 20500", "male", "open source software office"
It is one record. 

(Long Live President Smith!)

I have never seen such a beast, and if I had been asked off the cuff 
would have said that it was probably illegal, except that I know of no 
standard actually defining the format. Perhaps others (Joshua Drake or 
Josh Berkus?) have wider experience. I think in any case we should 
ignore those for now and handle the straightforward case.

I *have* seen monstrosities like fields that do not begin with the quote 
character but then break into a quote, e.g.:

1,2,a,123"abc""def",6,7,8

 
It's difficult to say it :-), but your DELIMITER idea is better than my
suggested API. Andrew,  go ahead. I thought about  some data streaming,
but COPY is probably bad place for it.

 



Just to be clear, I have no current plans for doing this. I have about 3 
pg related things I am working on, in addition to doing paid work so I 
can eat! If I do it it will not be for quite a while.

There are some wrinkles, though, concerning the interaction of CSV's 
notion of escaping and  COPY's notion of escaping. If someone want to 
undertake this I can flesh those out in a further email.

Someone was asking the other day about a newbie or student project - 
this might be a good one (impact relatively isolated, very easy to test) 
although possibly not as sexy as some might like.

cheers

andrew

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] Inherited tables

2004-03-19 Thread Francisco Reyes
On a recent thread Stephan Szabo mentioned some issues with inheritance.

***
> On Thu, 19 Feb 2004, Stephan Szabo wrote:
>
> > There are two separate things here that are gotchas
> > The first is that unique constraints don't inherit, and foreign keys must
> > refer to a unique constraint.
>
> > These are both deficiencies in inheritance and the constraints in
> > question.
***

Anyone could comment on those problems?
I am working on a design which will heavily use inherittance and I think
on the long run the limitations above could make things to be more work
for me.

I would be willing to put some money forward to have someone work on
improving those limitations. Anyone knows who may be a good candidate to
work on this?

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] COPY formatting

2004-03-19 Thread Tom Lane
Karel Zak <[EMAIL PROTECTED]> writes:
>>> It's pity  that main idea of  current COPY is based  on separated lines
>>> and it is not more common interface for streaming data between FE and BE.
>> 
>> Yeah, that was another concern I had.  This API would let the formatter
>> control line-level layout but it would not eliminate the hard-wired
>> significance of newline.  What's worse, there isn't any clean way to
>> deal with reading quoted newlines --- the formatter can't really replace
>> the default quoting rules if the low-level code is going to decide
>> whether a newline is quoted or not.

>  I think latest  protocol version works with blocks of  data and no with
>  lines and client PQputCopyData() returns a block -- only docs says that
>  it is row of table.

But you can't assume that the client will send blocks that are
semantically significant.  For instance, if psql is reading a file to
send with \copy, how's it going to know how the file is formatted?
It's just gonna send disk-block-sized messages, and the backend has to
discover the semantic boundaries for itself.

>  It sounds good, but I think we  both not full sure about it now, right?
>  CSV support will probably better add by DELIMITER extension.

Yeah, without people beating on our door for such a hook, it seems like
Andrew's DELIMITER idea is the best thing to do for now.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] COPY formatting

2004-03-19 Thread Fernando Nasser
Andrew Dunstan wrote:


Karel Zak wrote:

The problem with  CSV is that it will correctly  work with new protocol
only. Because old  versions of  clients are newline  sensitive. And CSV
can contains newline in by quotation marks defined attributes:
"John", "Smith", "The White House
1600 Pennsylvania Avenue NW
Washington, DC 20500", "male", "open source software office"
It is one record.


(Long Live President Smith!)

I have never seen such a beast,
Export from a spreadsheet where people have formated the cell with the 
address on it.

Regards,
Fernando
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] COPY formatting

2004-03-19 Thread Karel Zak
On Fri, Mar 19, 2004 at 09:39:58AM -0500, Tom Lane wrote:
> Karel Zak <[EMAIL PROTECTED]> writes:
> >>> It's pity  that main idea of  current COPY is based  on separated lines
> >>> and it is not more common interface for streaming data between FE and BE.
> >> 
> >> Yeah, that was another concern I had.  This API would let the formatter
> >> control line-level layout but it would not eliminate the hard-wired
> >> significance of newline.  What's worse, there isn't any clean way to
> >> deal with reading quoted newlines --- the formatter can't really replace
> >> the default quoting rules if the low-level code is going to decide
> >> whether a newline is quoted or not.
> 
> >  I think latest  protocol version works with blocks of  data and no with
> >  lines and client PQputCopyData() returns a block -- only docs says that
> >  it is row of table.
> 
> But you can't assume that the client will send blocks that are
> semantically significant.  For instance, if psql is reading a file to
> send with \copy, how's it going to know how the file is formatted?

 And what  \n in attibutes data  in CSV? I think CSV  format doesn't use
 some escape for  newline char. It means psql with \copy  cannot be sure
 with CSV.

Karel

-- 
 Karel Zak  <[EMAIL PROTECTED]>
 http://home.zf.jcu.cz/~zakkr/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] COPY formatting

2004-03-19 Thread Tom Lane
Karel Zak <[EMAIL PROTECTED]> writes:
>  The problem with  CSV is that it will correctly  work with new protocol
>  only. Because old  versions of  clients are newline  sensitive.

Why?  The client-side code doesn't have any real say over the meaning of
the data, at least not in psql-class clients.  I suppose a client app
that tries to interpret the data could get confused, but psql sure
doesn't do that.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Question on restoring and compiled plans

2004-03-19 Thread Tom Lane
Richard Huxton <[EMAIL PROTECTED]> writes:
> If I pg_restore a DB and don't ANALYSE before a pl/pgsql trigger activates 
> then that trigger's going to be planned based on the default stats forever 
> isn't it?

Only for the life of the current backend(s).

If that's still too long for you, you could do a CREATE OR REPLACE FUNCTION
that doesn't actually change anything about the trigger function; that
will force current cached plans to be thrown away.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] COPY formatting

2004-03-19 Thread Andrew Dunstan
Fernando Nasser wrote:

Andrew Dunstan wrote:



Karel Zak wrote:

The problem with CSV is that it will correctly work with new protocol
only. Because old versions of clients are newline sensitive. And CSV
can contains newline in by quotation marks defined attributes:
"John", "Smith", "The White House
1600 Pennsylvania Avenue NW
Washington, DC 20500", "male", "open source software office"
It is one record.


(Long Live President Smith!)

I have never seen such a beast,


Export from a spreadsheet where people have formated the cell with the 
address on it.

Well, I just tried with OpenOffice on my RH9 box, and it translated the 
line breaks in the cell into spaces in the CSV. When I replaced them 
with line breaks in a text editor and reloaded it treated them as 
separate rows.

I don't have a Windows box handy on which I can test Excel's behaviour.

cheers

andrew

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] pg_advisor schema proof of concept

2004-03-19 Thread Fabien COELHO

> please find attached a quick proof of concept for a 'pg_advisor' schema.

Here is a "larger" but nevertheless still quick proof of concept, alas
with some buggy PL/pgSQL that I wrote with my little finger.

It implements some foreign key type checks for which I submitted be
patches some time ago.

The more I think about it, the more I find it should be the way to go,
rather than having a new external tool.

-- 
Fabien Coelho - [EMAIL PROTECTED]--
-- $Id: pg_advisor.sql,v 1.13 2004/03/19 14:55:39 coelho Exp $
--
-- pg_advisor maybe future system schema?
--
-- at the time it is xpg_advisor as pg_ is reserved.
--

DROP SCHEMA xpg_advisor CASCADE;

CREATE SCHEMA xpg_advisor;

COMMENT ON SCHEMA xpg_advisor 
IS 'various advices about database design or performance' ;

SET search_path TO xpg_advisor,pg_catalog;

--
-- DESIGN ADVICES (da_*)
--

--
-- tables without primary keys
--

CREATE VIEW da_tables_without_primary_key AS 
SELECT n.nspname AS Schema, c.relname AS Name
FROM pg_class AS c 
  JOIN pg_namespace AS n ON (c.relnamespace=n.oid)
WHERE
  -- no comments about system catalogs.
  n.nspname NOT IN ('pg_catalog','pg_toast','information_schema','xpg_advisor')
  -- only tables
  AND c.relkind = 'r' 
  -- no primary key
  AND NOT c.relhaspkey
ORDER BY Schema ASC, Name ASC;

COMMENT ON VIEW da_tables_without_primary_key
IS 'it is better to have a primary key on your tables';

-- SELECT * FROM da_tables_without_primary_key;

--
-- tables with composite primary keys?
-- 

CREATE VIEW da_tables_with_composite_primary_key AS
SELECT n.nspname AS Schema, c.relname AS Name
FROM pg_class AS c 
  JOIN pg_namespace AS n ON (c.relnamespace=n.oid)
  JOIN pg_constraint AS o ON (o.conrelid=c.oid)
WHERE
  -- no comments about system catalogs.
  n.nspname NOT IN ('pg_catalog','pg_toast','information_schema','xpg_advisor')
  -- only tables
  AND c.relkind = 'r' 
  -- with a primary key
  AND c.relhaspkey
  -- the primary key constraint
  AND contype = 'p'
  -- more than 1 element in conkey
  AND array_upper(o.conkey, 1)>1
ORDER BY Schema ASC, Name ASC;

COMMENT ON VIEW da_tables_with_composite_primary_key
IS 'it may be considered a better design to have simple primary keys';

-- SELECT * FROM da_tables_with_composite_primary_key;

--
-- missing array_index function
--

CREATE OR REPLACE FUNCTION array_index(anyarray, anyelement) 
RETURNS INTEGER AS '
DECLARE
  tab ALIAS FOR $1;
  item ALIAS FOR $2;
  i INTEGER;
BEGIN
  i := 1;
  LOOP
IF i > array_upper(tab, 1) THEN
RETURN NULL;
END IF;
IF tab[i] = item THEN
RETURN i;
END IF;
i := i+1;
  END LOOP;
END;'
LANGUAGE plpgsql;

COMMENT ON FUNCTION array_index(anyarray, anyelement)
IS 'missing array_index function... should be already there!';

--
-- internal foreign key constraint...
--
CREATE VIEW in_foreign_key_contraint AS
SELECT
  -- constraint
  c.oid AS constraint, 
  -- referencing table
  c.conrelid AS crel, 
  cca.attname AS ccol,
  cca.atttypid AS ctyp,
  cca.atttypmod AS cmod,
  cca.attlen AS clen,
  -- referenced table, foreign part
  c.confrelid AS frel, 
  fka.attname AS fcol, 
  fka.atttypid AS ftyp, 
  fka.atttypmod AS fmod,
  fka.attlen AS flen,
  array_index(c.confkey, fka.attnum) AS component
FROM pg_constraint AS c
 JOIN pg_attribute AS cca ON (c.conrelid=cca.attrelid)
 JOIN pg_attribute AS fka ON (c.confrelid=fka.attrelid)
WHERE
  -- foreign key constraint
  c.contype='f'
  -- column attribute in constraint
  AND cca.attnum = ANY (c.conkey)
  -- foreign key attribute
  AND fka.attnum = ANY (c.confkey)
  -- matching constraints
  AND array_index(c.confkey, fka.attnum)=array_index(c.conkey, cca.attnum)
;

--
-- foreign keys which do not exactly match their target key types?
--

CREATE VIEW da_foreign_key_type_dont_match AS
SELECT 
  nc.nspname AS Schema,
  cc.relname AS Name,
  fkc.ccol AS AttName,
  format_type(fkc.ctyp, fkc.cmod) AS ColumnType,
  nf.nspname AS FSchema,
  cf.relname AS FName,
  fkc.fcol AS FAttName,
  format_type(fkc.ftyp, fkc.fmod) AS ForeignType,
  fkc.component AS Component
FROM in_foreign_key_contraint AS fkc 
  JOIN pg_class AS cc ON (fkc.crel=cc.oid)
  JOIN pg_namespace AS nc ON (cc.relnamespace=nc.oid)
  --JOIN pg_attribute AS ac ON (fkc.ccol=ac.oid)
  JOIN pg_class AS cf ON (fkc.frel=cf.oid)
  JOIN pg_namespace AS nf ON (cf.relnamespace=nf.oid)
  --JOIN pg_attribute AS af ON (fkc.fcol=af.oid)
WHERE
  -- no comments about system catalogs.
  nc.nspname NOT IN('pg_catalog','pg_toast','information_schema','xpg_advisor')
  AND 
  nf.nspname NOT IN('pg_catalog','pg_toast','information_schema','xpg_advisor')
  -- only tables (redundant?)
  AND cc.relkind='r' AND cf.relkind='r'
  -- non matching type
  AND fkc.ctyp!=fkc.ftyp
ORDER BY Schema ASC, Name ASC, FSchema ASC, FName ASC, Component ASC;

COMMENT ON VIEW da_foreign_key_type_dont_match
IS 'non matching foreing key component, maybe a bad design';

-- SELECT * FROM da_foreign_key_type_dont_match;

--
-- others?
--
-- foreign keys which do match thei

Re: [HACKERS] Question on restoring and compiled plans

2004-03-19 Thread Richard Huxton
On Friday 19 March 2004 15:04, Tom Lane wrote:
> Richard Huxton <[EMAIL PROTECTED]> writes:
> > If I pg_restore a DB and don't ANALYSE before a pl/pgsql trigger
> > activates then that trigger's going to be planned based on the default
> > stats forever isn't it?
>
> Only for the life of the current backend(s).

Ah that's alright then. ANALYZE is usually the first thing I do after a 
restore anyway, but thought I'd check how important it was that I remember.

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] COPY formatting

2004-03-19 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> There are some wrinkles, though, concerning the interaction of CSV's 
> notion of escaping and  COPY's notion of escaping. If someone want to 
> undertake this I can flesh those out in a further email.

Please do that, so that the info is in the archives in case someone else
wants to tackle the project.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] COPY formatting

2004-03-19 Thread Andrew Dunstan
I wrote:

Fernando Nasser wrote:

[snip re multiline CSV fields]

I have never seen such a beast,


Export from a spreadsheet where people have formated the cell with 
the address on it.

Well, I just tried with OpenOffice on my RH9 box, and it translated 
the line breaks in the cell into spaces in the CSV. When I replaced 
them with line breaks in a text editor and reloaded it treated them as 
separate rows.

I don't have a Windows box handy on which I can test Excel's behaviour.

Clearly my experience is out of date - I just found a web ref to Excel 
doing just this. Oh, well, it doesn't seem to me an insurmountable problem.

cheers

andrew

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[HACKERS] looking for an 'array_index' function?

2004-03-19 Thread Fabien COELHO

Dear hackers,

I'm looking for an array_index function that would return the index of the
occurence of some element in an array. I noticed that I can know whether a
element is in the array with ANY, but I also need the actual index.

There is no such function under that name, and I cannot find any that
would do the job in the doc. I made a quick workaround in plpgsql, but I
would have expected the function to be there.

Am I unlucky or just blind?

I know that looking for such a function is a proof a bad design, but I
don't feel responsible for the pg_catalog.* design;-)

Thanks in advance for any pointer,

Have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] COPY formatting

2004-03-19 Thread Tom Lane
Karel Zak <[EMAIL PROTECTED]> writes:
>  And what  \n in attibutes data  in CSV? I think CSV  format doesn't use
>  some escape for  newline char. It means psql with \copy  cannot be sure
>  with CSV.

I think CSV only allows embedded newlines that are either escaped, or
inside quotes.  COPY doesn't currently have the notion of a quote
character, but that was part of Andrew's proposal ...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Question on restoring and compiled plans

2004-03-19 Thread Richard Huxton
If I pg_restore a DB and don't ANALYSE before a pl/pgsql trigger activates 
then that trigger's going to be planned based on the default stats forever 
isn't it?

-- 
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] 7.4.2 Build broken on (Sparc) Solaris 7 and 8

2004-03-19 Thread Bruce Momjian
> Environment:
> 
> SunOS 5.8 Generic_108528-27 sun4u sparc SUNW,Ultra-250
> gcc version 3.3.1
> PostgreSQL-7.4.2
> ./configure --with-java --enable-thread-safety
> 
> "make" results in:
> 
> gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes 
> -Wmissing-declarations -fPIC -I. -I../../../src/include   
> -DFRONTEND -DSYSCONFDIR='"/usr/local/pgsql/etc"'  -c -o thread.o 
> thread.c
> thread.c: In function `pqGetpwuid':
> thread.c:116: error: too many arguments to function `*getpwuid_r*'
> 
> 
> Environment:
> 
> SunOS 5.7 Generic_106541-29 sun4u sparc SUNW,UltraSPARC-IIi-Engine
> gcc version 3.3.1
> PostgreSQL-7.4.2
> ./configure --with-java --enable-thread-safety
> 
> "make' results in:
> 
> gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes 
> -Wmissing-declarations -fPIC -I. -I../../../src/include   
> -DFRONTEND -DSYSCONFDIR='"/usr/local/pgsql/etc"'  -c -o thread.o 
> thread.c
> thread.c: In function `pqGetpwuid':
> thread.c:116: error: too many arguments to function `*getpwuid_r*'
> thread.c: In function `pqGethostbyname':
> thread.c:189: error: `resbuf' undeclared (first use in this function)
> thread.c:189: error: (Each undeclared identifier is reported only once
> thread.c:189: error: for each function it appears in.)
> 
> Diff'ing thread.c between 7.4.1 and 7.4.2, it *looks* like, at first
> blush, nothing changed that should affect the relevant code.
> 
> Anybody got any idea what's broken?

Unfortunately, I know exactly what is broken.  First, pre7.4.2 didn't
use thread.c properly.  Now that it does, you are breaking on this
issue:

/*
 * Early POSIX draft of getpwuid_r() returns 'struct passwd *'.
 *getpwuid_r(uid, resultbuf, buffer, buflen)
 * Do we need to support it?  bjm 2003-08-14
 */
/* POSIX version */
getpwuid_r(uid, resultbuf, buffer, buflen, result);

Notice the comment.  Do we have to support getpwuid_r that returns
passwd *, and doesn't take a fourth argument?  Yea, for Solaris 7 & 8,
we now we now do.  I think you can get yours working by just changing
the call to:
 
   result = getpwuid_r(uid, resultbuf, buffer, buflen);

I will have to add configure tests for this and it will work properly
for you in 7.4.3.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position "CREATE FUNCTION" bug fix)

2004-03-19 Thread Tom Lane
[ moving thread to hackers ]

Fabien COELHO <[EMAIL PROTECTED]> writes:
> However, I still stick with my "bad" simple idea because the simpler the
> better, and also because of the following example:
> ...
> psql> SELECT count_tup('pg_shadow');
> ERROR:  syntax error at or near "FRM" at character 22
> CONTEXT:  PL/pgSQL function "count_tup" line 4 at for over execute statement

> As you can notice, the extract is not in the submitted query, so there
> is no point to show it there.

Yeah.  However, I dislike your solution because it confuses the cases of
a syntax error in the actually submitted query, and a syntax error in an
internally generated query.  We should keep these cases clearly separate
because clients may want to do different things.  For a syntax error in
the submitted input, what you probably want to do is edit and resubmit
the original query --- that's the case I was thinking about in saying
that a GUI client like pgadmin would want to set the editing cursor in
the original input window.  But this action is nonsensical if the syntax
error is from a generated query.  Perhaps the GUI client could be smart
enough to pop up a new window in which one could edit and resubmit the
erroneous function definition.

Even in psql's simplistic error handling, you want to distinguish the
two cases.  There's no point in showing the entire original query; one
line worth of context is plenty.  But you very probably do want to see
all of a generated query.  So I don't want the backend sending back
error reports that look the same in both cases.

The original design concept for the 'P' (position) error field is that
it would be used to locate syntax errors in the *original query*, and
so its presence is a cue to the client code to go in the direction of
setting the editing cursor.  (Note the protocol specification says
"index into the original query string".)  We have in fact misimplemented
it, because it is being set for syntax errors in internally generated
queries too.

I was already planning to modify plpgsql to send back the full text of
generated queries when there is an error.  My intention was to supply
this just as part of the CONTEXT stack, that is instead of your example
of

ERROR:  syntax error at or near "FRM" at character 22
CONTEXT:  PL/pgSQL function "count_tup" line 4 at for over execute statement

you'd get something like

ERROR:  syntax error at or near "FRM" at character 22
CONTEXT:  Executing command "SELECT COUNT(*) AS c FRM pg_shadow"
PL/pgSQL function "count_tup" line 4 at for over execute statement

However it might be better to invent a new error-message field that
carries just the text of the SQL command, rather than stuffing it into
CONTEXT.  (This is similar to your original patch, but different in
detail because I'm envisioning sending back generated queries, never the
submitted query.  Regurgitating the submitted query is just a waste of
bandwidth.)  The plus side of that would be that it'd be easy to extract
for syntax-error highlighting.  The minuses are that existing clients
would fail to print such a field (the protocol spec says to ignore
unknown fields), and that there is no good way to cope with nested
queries.

A possible compromise is to put the text of the generated SQL command
into a new field only if the error is a syntax error, and put it into
the CONTEXT stack otherwise.  Syntax errors couldn't be nested so at
least that problem goes away.  This seems a bit messy though.

The other thing to think about is whether we should invent a new field
to carry syntax error position for generated queries, rather than making
'P' do double duty as it does now.  If we don't do that then we have to
change the protocol specification to reflect reality.  In any case I
think it has to be possible to tell very easily from the error message
whether the 'P' position refers to the submitted query or a generated
query.

Comments anyone?

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] COPY formatting

2004-03-19 Thread Thomas Swan

>
>
> Karel Zak wrote:
>
>> The problem with  CSV is that it will correctly  work with new protocol
>> only. Because old  versions of  clients are newline  sensitive. And CSV
>> can contains newline in by quotation marks defined attributes:
>>
>> "John", "Smith", "The White House
>> 1600 Pennsylvania Avenue NW
>> Washington, DC 20500", "male", "open source software office"
>>
>> It is one record.
>>
>
> (Long Live President Smith!)
>
> I have never seen such a beast, and if I had been asked off the cuff
> would have said that it was probably illegal, except that I know of no
> standard actually defining the format. Perhaps others (Joshua Drake or
> Josh Berkus?) have wider experience. I think in any case we should
> ignore those for now and handle the straightforward case.
>
> I *have* seen monstrosities like fields that do not begin with the quote
> character but then break into a quote, e.g.:
>
> 1,2,a,123"abc""def",6,7,8
>

I have dealt with both, frequently.   The CSV format allows you to begin a
block of text with the quote.  Newlines are included in that quoted space.
  If qoutes are included in the field then the quotes are double quotes to
denote they are not part of the quoted space.

Also, the following is valid.

1,2,,,"",,,""

"" is empty.

1,2,3,"",

The 4 quotes denote a single double quote.

Writing simple CSV converts that just explode on commas and newlines miss
these oddities.

Try exporting an Access table with a Memo field (containg multiple lines)
to CSV.



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] COPY formatting

2004-03-19 Thread Josh Berkus
Thomas, Andrew, Karel,

Thomas is correct: many applications which read or make CSVs will accept a 
newline if it is enclosed in a quote.   

> > I *have* seen monstrosities like fields that do not begin with the quote
> > character but then break into a quote, e.g.:
> >
> > 1,2,a,123"abc""def",6,7,8

This I have never seen.   It looks like a hackish error to me.   What 
application is it from?

Frankly, I would expect any CSV reader to error out on the above, and would be 
annoyed if it did not.

Overall, I assert again that approaching this issue through COPY enhancements 
is really not the way to go.We should be looking at a client utility, 
like pg_import and pg_export. The primary purpose of COPY is bulk loads 
for backup/restore, and I'm against doing a lot of tinkering which might make 
it less efficient or introduce new issues into what's currently very 
reliable.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] COPY formatting

2004-03-19 Thread Andrew Dunstan
Bruce Momjian wrote:

OK, so for the separator, quote, and escape options:

separator is ,
quote is "
escape is "
so if the quote and escape are the same, then a double denotes a
single?
 

Yes. i.e. with the above settings "abc""def" -> abc"def

cheers

andrew

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] COPY formatting

2004-03-19 Thread Andrew Dunstan
Josh Berkus wrote:

Overall, I assert again that approaching this issue through COPY enhancements 
is really not the way to go.We should be looking at a client utility, 
like pg_import and pg_export. The primary purpose of COPY is bulk loads 
for backup/restore, and I'm against doing a lot of tinkering which might make 
it less efficient or introduce new issues into what's currently very 
reliable.

 

That's not unreasonable. I floated my idea as an alternative to a much 
more radical proposal. If we decided against it we should remove the 
TODO item.

As against that, if we don't do this then I think we should embrace 
these utility programs more, possibly bringing them into the distribution.

cheers

andrew

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] COPY formatting

2004-03-19 Thread Bruce Momjian
Andrew Dunstan wrote:
> Josh Berkus wrote:
> 
> >
> >Overall, I assert again that approaching this issue through COPY enhancements 
> >is really not the way to go.We should be looking at a client utility, 
> >like pg_import and pg_export. The primary purpose of COPY is bulk loads 
> >for backup/restore, and I'm against doing a lot of tinkering which might make 
> >it less efficient or introduce new issues into what's currently very 
> >reliable.
> >
> >  
> >
> 
> That's not unreasonable. I floated my idea as an alternative to a much 
> more radical proposal. If we decided against it we should remove the 
> TODO item.
> 
> As against that, if we don't do this then I think we should embrace 
> these utility programs more, possibly bringing them into the distribution.

CSV seems to be the most widely requested conversion format.  Anything
else is probably a one-off job that should be done in perl or sed.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] looking for an 'array_index' function?

2004-03-19 Thread Joe Conway
Fabien COELHO wrote:
Am I unlucky or just blind?
Unlucky I guess. No such function, least not yet.

Joe

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] [HACKERS] fsync method checking

2004-03-19 Thread Kevin Brown
I wrote:
> Note, too, that the preferred method isn't likely to depend just on the
> operating system, it's likely to depend also on the filesystem type
> being used.
> 
> Linux provides quite a few of them: ext2, ext3, jfs, xfs, and reiserfs,
> and that's just off the top of my head.  I imagine the performance of
> the various syncing methods will vary significantly between them.

For what it's worth, my database throughput for transactions involving
a lot of inserts, updates, and deletes is about 12% faster using
fdatasync() than O_SYNC under Linux using JFS.

I'll run the test program and report my results with it as well, so
we'll be able to see if there's any consistency between it and the live
database.




-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])