Re: [HACKERS] [pgsql-hackers-win32] Win32 signal code - first try

2004-01-15 Thread Claudio Natoli
 

 1. setjmp/longjmp stack manipulation (i.e. ELOG)

Wrote a small program to check this out. As we hoped/assumed/expected, it
looks just fine.
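
For the record, the shape of such a test looks like this (an illustrative
reconstruction, not the actual program): it longjmp()s out of a user APC
running on the main thread, the way elog(ERROR) unwinds out of a signal
handler.

#include <windows.h>
#include <setjmp.h>
#include <stdio.h>

static jmp_buf env;

/* Runs as a user APC on the main thread; the longjmp simulates
 * elog(ERROR) escaping from "signal handler" context. */
static VOID CALLBACK apc_func(ULONG_PTR arg)
{
    longjmp(env, 1);
}

int main(void)
{
    if (setjmp(env) == 0)
    {
        /* Queue an APC to ourselves; it is delivered as soon as we
         * enter an alertable wait state. */
        QueueUserAPC(apc_func, GetCurrentThread(), 0);
        SleepEx(INFINITE, TRUE);    /* alertable wait; APC fires here */
        printf("not reached\n");
    }
    else
        printf("longjmp out of the APC worked\n");
    return 0;
}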

Magnus, are you working on a second run at this stuff?

Cheers,
Claudio



Re: [HACKERS] Encoding problems in PostgreSQL with XML data

2004-01-15 Thread Hannu Krosing
Merlin Moncure wrote on Wed, 14.01.2004 at 15:49:
 Hannu Krosing wrote:
  I hope that real as-needed-column-by-column translation will be used
  with bound argument queries.
  
  It also seems possible to delegate the encoding changes to after the
  query is parsed, but this will never work for EBCDIC and other funny
  encodings (like rot13 ;).
  
  for these we need to define the actual SQL statement encoding on-wire
  to be always ASCII.
 
 In that case, treat the XML document like a binary stream, using
 PQescapeBytea, etc. to encode if necessary pre-query.  Also, the XML
 domain should inherit from bytea, not varchar.

why ?

the allowed character repertoire in XML is even smaller than in varchar.

 The document should be stored bit for bit as was submitted.

Or in some pre-parsed form which allows restoration of the submitted form,
which could be more efficient for things like xpath queries or subtree
extraction.

 If we can do that for bitmaps, why can't we do it for XML documents?  
 
 OTOH, if we are transforming the document down to a more generic format
 (either canonical or otherwise), then the xml could be dealt with like
  text in the usual way.  Of course, then we are not really storing xml,
 more like 'meta' xml ;)

On the contrary! If there is DTD or Schema or other structure definition
for XML, then we know which whitespace is significant and can do
whatever we like with insignificant whitespace.

It is also ok to store all XML in some Unicode encoding, as this is what
every XML document must be convertible to.

it's the same as storing ints - you don't care if you specified 1000 or
1e3 when doing the insert, as

hannu=# select 1000=1e3;
 ?column?
----------
 t
(1 row)

in the same way the following should also be true

select
'<d/>'::xml == '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'::xml
;

---
Hannu




[HACKERS] FYI, Intel CC and PostgreSQL, benchmark by pgsql

2004-01-15 Thread jihuang
Hi,
I have a new server and some time to do an interesting simple benchmark:
compile PostgreSQL 7.4.1R with gcc 3.2 and Intel CC 8.0, and use pgbench
to evaluate any difference.

Here is the result.

--

CPU: Intel(R) Xeon(TM) CPU 3.06GHz (3052.79-MHz 686-class CPU)
Origin = GenuineIntel  Id = 0xf29  Stepping = 9
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Hyperthreading: 2 logical CPUs
real memory  = 3221200896 (3071 MB)
avail memory = 3130855424 (2985 MB)
FreeBSD 5.1-RELEASE-p11
/usr/local/intel_cc_80/bin/icc -V
Intel(R) C++ Compiler for 32-bit applications, Version 8.0   Build 
20031211Z Package ID: l_cc_p_8.0.055_pe057
Copyright (C) 1985-2003 Intel Corporation.  All rights reserved.
FOR NON-COMMERCIAL USE ONLY

gcc -v
Using built-in specs.
Configured with: FreeBSD/i386 system compiler
Thread model: posix
gcc version 3.2.2 [FreeBSD] 20030205 (release)
Application: PostgreSQL 7.4.1
Benchmark: pgbench
Result:
1. IntelCC ( use ports/database/postgresql7 , default )
./pgbench -U pgsql -c 30  test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 30
number of transactions per client: 10
number of transactions actually processed: 300/300
tps = 34.975026 (including connections establishing)
tps = 35.550815 (excluding connections establishing)
2. GNU cc( use ports/database/postgresql7 , default )
./pgbench -U pgsql -c 30  test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 30
number of transactions per client: 10
number of transactions actually processed: 300/300
tps = 38.968321 (including connections establishing)
tps = 39.707451 (excluding connections establishing)






Re: [HACKERS] [pgsql-hackers-win32] Win32 signal code - first try

2004-01-15 Thread Magnus Hagander
  1. setjmp/longjmp stack manipulation (i.e. ELOG)
 
 Wrote a small program to check this out. As we 
 hoped/assumed/expected, it looks just fine.
Great!
Just checking - we're talking about doing this from inside a user APC, right?

 Magnus, are you working on a second run at this stuff?

Not right this moment, but I hope to do that sometime this weekend. (No
time until then). 


A thought there. If we go with the select loop version you had tried
out and just poll (select() with short timeout, poll, loop..), do we
actually *need* the APCs *at all*? Or should we go with a pure-polling
solution? (With signals delivered on a separate thread as we discussed,
of course.) The only advantage we gain from the APCs is that they will
interrupt the alertable ...Ex() wait functions, no? (It's not much extra
code to put it in there, but if we don't need it..)
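
Roughly what I mean, as a sketch (the names here are illustrative, not
actual win32 port code): the loop checks a flag the signal thread sets,
so nothing needs to interrupt the wait.

#include <winsock2.h>

extern volatile int pg_signal_pending;          /* set by the signal thread */
extern void pg_dispatch_queued_signals(void);   /* hypothetical dispatcher */

static int
wait_for_input(SOCKET sock)
{
    for (;;)
    {
        fd_set          rfds;
        struct timeval  tv;
        int             rc;

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        tv.tv_sec = 0;
        tv.tv_usec = 100000;        /* 100ms poll interval */
        rc = select(0, &rfds, NULL, NULL, &tv);

        /* check for "signals" on every iteration - no APC required */
        if (pg_signal_pending)
            pg_dispatch_queued_signals();

        if (rc > 0)
            return 0;               /* input ready */
        if (rc == SOCKET_ERROR)
            return -1;
        /* rc == 0: timeout, just loop and poll again */
    }
}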


//Magnus



[HACKERS] set search_path and pg_dumpall

2004-01-15 Thread ohp
Hi all,

Still trying to upgrade from 7.3.4 to 7.4.1 with pg_dumpall.
Two problems:

When a search_path has been set, pg_dumpall correctly outputs an ALTER
DATABASE xxx SET search_path TO 'xxx', but *BEFORE* the schema is created,
so it doesn't work.

And about my previous mails: if I pg_dump the database I have problems
with, it reloads without problems, but if I pg_dumpall then I get the
problems I wrote about before.

Here is part of the log file:

ERROR:  extra data after last expected column
ERROR:  missing data for column "msg_date2"
ERROR:  missing data for column "guilde_valide"
ERROR:  insert or update on table "guilde_rang" violates foreign key constraint "$1"
ERROR:  insert or update on table "guilde_perso" violates foreign key constraint "$1"

here are the table schemas:

--
-- PostgreSQL database dump
--

SET SESSION AUTHORIZATION 'sdewitte';

SET search_path = public, pg_catalog;

--
-- TOC entry 2 (OID 31845735)
-- Name: messages; Type: TABLE; Schema: public; Owner: sdewitte
--

CREATE TABLE messages (
msg_cod integer DEFAULT nextval('seq_msg_cod'::text) NOT NULL,
msg_date date NOT NULL,
msg_titre character varying(50),
msg_corps text,
msg_date2 timestamp with time zone
);


--
-- TOC entry 4 (OID 37161058)
-- Name: messages_msg_cod_key; Type: INDEX; Schema: public; Owner: sdewitte
--

CREATE INDEX messages_msg_cod_key ON messages USING btree (msg_cod);


--
-- TOC entry 5 (OID 37161059)
-- Name: messages_msg_date_key; Type: INDEX; Schema: public; Owner: sdewitte
--

CREATE INDEX messages_msg_date_key ON messages USING btree (msg_date);


--
-- TOC entry 7 (OID 37161115)
-- Name: msg_date2_messages_key; Type: INDEX; Schema: public; Owner: sdewitte
--

CREATE INDEX msg_date2_messages_key ON messages USING btree (msg_date2);


--
-- TOC entry 6 (OID 37161292)
-- Name: messages_pkey; Type: CONSTRAINT; Schema: public; Owner: sdewitte
--

ALTER TABLE ONLY messages
ADD CONSTRAINT messages_pkey PRIMARY KEY (msg_cod);


--
-- TOC entry 8 (OID 37161665)
-- Name: RI_ConstraintTrigger_37161665; Type: TRIGGER; Schema: public; Owner: sdewitte
--

CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER DELETE ON messages
FROM messages_dest
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_noaction_del"('<unnamed>', 'messages_dest', 'messages', 
'UNSPECIFIED', 'dmsg_msg_cod', 'msg_cod');


--
-- TOC entry 9 (OID 37161666)
-- Name: RI_ConstraintTrigger_37161666; Type: TRIGGER; Schema: public; Owner: sdewitte
--

CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER UPDATE ON messages
FROM messages_dest
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_noaction_upd"('<unnamed>', 'messages_dest', 'messages', 
'UNSPECIFIED', 'dmsg_msg_cod', 'msg_cod');


--
-- TOC entry 10 (OID 37161671)
-- Name: RI_ConstraintTrigger_37161671; Type: TRIGGER; Schema: public; Owner: sdewitte
--

CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER DELETE ON messages
FROM messages_exp
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_noaction_del"('<unnamed>', 'messages_exp', 'messages', 
'UNSPECIFIED', 'emsg_msg_cod', 'msg_cod');


--
-- TOC entry 11 (OID 37161672)
-- Name: RI_ConstraintTrigger_37161672; Type: TRIGGER; Schema: public; Owner: sdewitte
--

CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER UPDATE ON messages
FROM messages_exp
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_noaction_upd"('<unnamed>', 'messages_exp', 'messages', 
'UNSPECIFIED', 'emsg_msg_cod', 'msg_cod');


--
-- TOC entry 3 (OID 31845735)
-- Name: TABLE messages; Type: COMMENT; Schema: public; Owner: sdewitte
--

COMMENT ON TABLE messages IS 'Messages internes';


--
-- PostgreSQL database dump
--

SET SESSION AUTHORIZATION 'sdewitte';

SET search_path = public, pg_catalog;

--
-- TOC entry 2 (OID 31845953)
-- Name: guilde; Type: TABLE; Schema: public; Owner: sdewitte
--

CREATE TABLE guilde (
guilde_cod integer DEFAULT nextval('seq_guilde_cod'::text) NOT NULL,
guilde_nom character varying(100) NOT NULL,
guilde_description text NOT NULL,
guilde_valide character varying(2) DEFAULT 'O'::character varying NOT NULL
);


--
-- TOC entry 4 (OID 37161158)
-- Name: guilde_guilde_cod_key; Type: INDEX; Schema: public; Owner: sdewitte
--

CREATE INDEX guilde_guilde_cod_key ON guilde USING btree (guilde_cod);


--
-- TOC entry 6 (OID 37161265)
-- Name: guilde_valide_guilde_key; Type: INDEX; Schema: public; Owner: sdewitte
--

CREATE INDEX guilde_valide_guilde_key ON guilde USING btree (guilde_valide);


--
-- TOC entry 5 (OID 37161364)
-- Name: guilde_pkey; Type: CONSTRAINT; Schema: public; Owner: sdewitte
--

ALTER TABLE ONLY guilde
ADD CONSTRAINT guilde_pkey PRIMARY KEY (guilde_cod);


--
-- TOC entry 3 (OID 31845953)
-- Name: TABLE guilde; Type: COMMENT; Schema: public; Owner: sdewitte
--

COMMENT ON TABLE guilde IS 'Liste des guildes';


and the offending lines:
guilde:
18  Septième Compagnie  En Mémoire de ceux qui, 

Re: [HACKERS] FYI, Intel CC and PostgreSQL, benchmark by pgsql

2004-01-15 Thread Tom Lane
jihuang <[EMAIL PROTECTED]> writes:
 I have a new server and some time to do an interesting simple benchmark:
 compile PostgreSQL 7.4.1R with gcc 3.2 and Intel CC 8.0, and use pgbench
 to evaluate any difference.

In my experience, a 10% difference in pgbench results is below the noise
level :-(.  And when you only run 300 transactions, the results are too
unrepeatable even to be worth posting.

If you had run, say, 1000-transaction tests and averaged the results
over a couple dozen runs, the mean and standard deviation of those
results might be enough data to tell something.
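
Something like this trivial helper would do the averaging (illustration
only; it reads one tps value per line, e.g. grepped out of repeated
pgbench runs, and prints mean and standard deviation):

#include <stdio.h>
#include <math.h>

int main(void)
{
    double x, sum = 0.0, sumsq = 0.0, mean, sd;
    int n = 0;

    /* one tps value per line on stdin */
    while (scanf("%lf", &x) == 1)
    {
        sum += x;
        sumsq += x * x;
        n++;
    }
    if (n < 2)
        return 1;

    mean = sum / n;
    sd = sqrt((sumsq - n * mean * mean) / (n - 1));
    printf("n=%d  mean=%.2f tps  stddev=%.2f\n", n, mean, sd);
    return 0;
}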

BTW, when you use a number of clients greater than the scaling
factor, what you're measuring is mostly contention artifacts ...

regards, tom lane



Re: [HACKERS] [pgsql-hackers-win32] Microsoft releases Services for Unix

2004-01-15 Thread Andrew Dunstan


Tom Lane wrote:

> Claudio Natoli <[EMAIL PROTECTED]> writes:
>
>> * Users already have a postgres solution for Win32. It is called Cygwin w/
>> cygipc. Sure, it is not the most stable solution, but, IMHO, that's not what
>> prevents people from using it; it is the need to install yet-another bit of
>> software to support Postgres.
>
> Well, the $64 questions that have not been answered are what are the
> license terms and redistribution terms for SFU?  If we can bundle the
> needed parts of SFU into a binary distribution of Postgres, then there
> is no need for users to be aware it is in there.  If we can't, then
> I agree that a port based on it would be about as hard to sell as the
> Cygwin port.  (Yeah, maybe it'd be more stable and faster, but it'd not
> be perceived as a native port.)

I suspect it would be somewhere in between. I can tell you from personal 
experience that getting Cygwin into a large data centre can be very 
hard, if not impossible. The techno-bureaucrats that run them can be 
(understandably) very anal and paranoid about what they allow on their 
machines. If you are running a bank or a nuclear power station it is the 
only sensible way to be. (You might argue that banks and nuclear power 
stations should not be controlled by Windows machines - but that's 
another argument - let's not go there right now ;-)

I don't think I would have encountered as much resistance to getting 
WSFU onto these machines - some, but not as much.

The licensing issue does affect companies like the one I used to work 
for, that wanted to be able to bundle a database with the product.

> Given the previous comments about Microsoft's goals in giving this away,
> one would think they'd allow it to be bundled in distributions of free
> software.  But who knows ...

Not me :-)

>> * I don't buy the argument that moving to SFU will remove a lot of specific
>> Win32 code. On what evidence is this based? [personally, I think it'd
>> only get worse, again, based on little evidence]. Seems to me the bulk of
>> the Win32 specific code lies with fork/exec, which (unless I'm terribly
>> mistaken) won't be alleviated by SFU.
>
> If SFU doesn't provide a reasonable fork() emulation then it's no help,
> agreed.  But again, from what I understand of Microsoft's goals, I'd
> think they'd have to provide a good fork().  I think Postgres is a
> perfect poster child for the sort of app they want to make easy to port
> to Windows.

Agreed. I think this is worth exploring, but I don't think it's worth 
stopping what we are doing right now while we explore.

Note that the migration guide says that threads are not supported. So if 
we ever went to a threaded implementation we could not go down this path.

cheers

andrew



Re: [HACKERS] [pgsql-hackers-win32] Microsoft releases Services for Unix

2004-01-15 Thread Magnus Hagander

> Note that the migration guide says that threads are not supported. So
> if we ever went to a threaded implementation we could not go down this
> path.

POSIX threads are supported in the upcoming version 3.5... The migration
guide is apparently not updated.

//Magnus



Re: [HACKERS] Encoding problems in PostgreSQL with XML data

2004-01-15 Thread Merlin Moncure
Hannu Krosing wrote:

  In that case, treat the XML document like a binary stream, using
  PQescapeBytea, etc. to encode if necessary pre-query.  Also, the XML
  domain should inherit from bytea, not varchar.
 
 why ?
 
 the allowed character repertoire in XML is even smaller than in varchar.

Yes, that is correct.  I was resorting to hyperbole...see my reasoning
below.

  The document should be stored bit for bit as was submitted.
 
 Or in some pre-parsed form which allows restoration of the submitted form,
 which could be more efficient for things like xpath queries or subtree
 extraction.

This is the crucial point:  I'll try and explain my thinking better.

  OTOH, if we are transforming the document down to a more generic format
  (either canonical or otherwise), then the xml could be dealt with like
  text in the usual way.  Of course, then we are not really storing xml,
  more like 'meta' xml ;)
 
 On the contrary! If there is DTD or Schema or other structure definition
 for XML, then we know which whitespace is significant and can do
 whatever we like with insignificant whitespace.

According to the XML standard, whitespace is always significant unless
it is outside an element or attribute and thus not part of the real
data.  A DTD or Schema adds constraints, not removes them.  I'm
nitpicking, but this is extra evidence for my philosophy of xml storage
that I'll explain below.

 select
 '<d/>'::xml == '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'::xml

Right: I understand your reasoning here.  Here is the trick:

select '[...]'::xml introduces a casting step which justifies a
transformation.  The original input data is not xml, but varchar.  Since
there are no arbitrary rules on how to do this, we have some flexibility
here to do things like change the encoding/mess with the whitespace.  I
am trying to find a way to break the assumption that my xml data
necessarily has to be converted from raw text.

My basic point is that we are confusing the roles of storing and
parsing/transformation.  The question is: are we storing xml documents
or the metadata that makes up xml documents?  We need to be absolutely
clear on which role the server takes on...in fact both roles may be
appropriate for different situations, but should be represented by a
different type.  I'll try and give examples of both situations.

If we are strictly storing documents, IMO the server should perform zero
modification on the document.  Validation could be applied conceptually
as a constraint (and, possibly XSLT/XPATH to allow a fancy type of
indexing).  However there is no advantage that I can see to manipulating
the document, except to break the 'C' of ACID.  My earlier comment wrt
binary encoding was that there simply has to be a way to prevent the
server mucking with my document.
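
To make that concrete, a rough sketch of the client-side approach I mean
(the xml_docs table and the function name are made up for illustration;
error handling is trimmed):

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

/* Store an XML document byte-for-byte, escaping it with PQescapeBytea
 * so the server never reinterprets the document's encoding. */
int
store_xml_document(PGconn *conn, const unsigned char *doc, size_t doclen)
{
    size_t          esclen;
    unsigned char  *esc = PQescapeBytea(doc, doclen, &esclen);
    char           *query;
    PGresult       *res;
    int             ok;

    if (esc == NULL)
        return -1;

    /* hypothetical table: xml_docs(doc bytea) */
    query = malloc(esclen + 64);
    sprintf(query, "INSERT INTO xml_docs (doc) VALUES ('%s')", esc);

    res = PQexec(conn, query);
    ok = (PQresultStatus(res) == PGRES_COMMAND_OK) ? 0 : -1;

    PQclear(res);
    free(query);
    PQfreemem(esc);
    return ok;
}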

For example, if I was using postgres to store XML-EDI documents in a DX
system this is the role I would prefer.  Validation and indexing are
useful, but my expected use of the server is a type of electronic xerox
of the incoming document.  I would be highly suspicious of any
modification the server made to my document for any reason.  

Now, if we are storing xml as content, say for a content database
backing a web page, the server takes on the role of a meta-data storage
system.  Now, it is reasonable to assume the server might do additional
processing besides storage and validation.  The character encoding of
the incoming data is of little or no importance because the xml will
almost certainly undergo an additional transformation step after
extraction from the database.  Flexibility, simplicity, and utility are
the most important requirements, so text transformation to a default
encoding would be quite useful.

Based on your suggestions I think you are primarily concerned with the
second example.  However, in my work I do a lot of DX and I see the xml
document as a binary object.  Server-side validation would be extremely
helpful, but please don't change my document!

So, I submit that we are both right for different reasons.

Regards,
Merlin




Re: [HACKERS] Encoding problems in PostgreSQL with XML data

2004-01-15 Thread Hannu Krosing
Merlin Moncure wrote on Thu, 15.01.2004 at 18:43:
 Hannu Krosing wrote:

  select
  '<d/>'::xml == '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'::xml
 
 Right: I understand your reasoning here.  Here is the trick:
 
 select '[...]'::xml introduces a casting step which justifies a
 transformation.  The original input data is not xml, but varchar.  Since
 there are no arbitrary rules on how to do this, we have some flexibility
 here to do things like change the encoding/mess with the whitespace.  I
 am trying to find a way to break the assumption that my xml data
 necessarily has to be converted from raw text.
 
 My basic point is that we are confusing the roles of storing and
 parsing/transformation.  The question is: are we storing xml documents
 or the metadata that makes up xml documents?  We need to be absolutely
 clear on which role the server takes on...in fact both roles may be
 appropriate for different situations, but should be represented by a
 different type.  I'll try and give examples of both situations.
 
 If we are strictly storing documents, IMO the server should perform zero
 modification on the document.  Validation could be applied conceptually
 as a constraint (and, possibly XSLT/XPATH to allow a fancy type of
 indexing).  However there is no advantage that I can see to manipulating
 the document, except to break the 'C' of ACID.  My earlier comment wrt
 binary encoding was that there simply has to be a way to prevent the
 server mucking with my document.
 
 For example, if I was using postgres to store XML-EDI documents in a DX
 system this is the role I would prefer.  Validation and indexing are
 useful, but my expected use of the server is a type of electronic xerox
 of the incoming document.  I would be highly suspicious of any
 modification the server made to my document for any reason.  

The current charset/encoding support can be evil in some cases ;(

The only solution seems to be keeping both server and client encoding as
ASCII (or just disabling it).

The proper path to encodings must unfortunately do the encoding
conversions *after* parsing, when it is known which parts of the
original query string should be changed.

Or, as you suggested, always encode anything outside plain ASCII (<32
and >127), both on input (can be done client-side) and on output (IIRC
this needs another type with a different output function).
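
As a sketch of the input side (function name is made up; it assumes
Latin-1 input so one byte is one code point - real code would have to
work per-character for multibyte encodings):

#include <stdio.h>

/* Escape a Latin-1 string into pure-ASCII XML text using numeric
 * character references, so the on-wire query contains only plain
 * ASCII.  'out' must have room for up to 6 bytes per input byte,
 * plus a terminator. */
static size_t
xml_ascii_escape_latin1(const unsigned char *in, size_t len, char *out)
{
    size_t i, o = 0;

    for (i = 0; i < len; i++)
    {
        unsigned char c = in[i];

        if (c >= 32 && c <= 127)
            out[o++] = (char) c;
        else
            o += sprintf(out + o, "&#%u;", (unsigned int) c);
    }
    out[o] = '\0';
    return o;
}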

 Based on your suggestions I think you are primarily concerned with the
 second example.  However, in my work I do a lot of DX and I see the xml
 document as a binary object.  Server-side validation would be extremely
 helpful, but please don't change my document!

So the problem is not exactly XML, but rather problems with changing
encodings of binary strings that should not be changed.

I hope (but I'm not sure) that keeping client and server encodings the
same should prevent that.

 So, I submit that we are both right for different reasons.

Seems so.

-
Hannu





[HACKERS] Postgres v.7.3.4 - Performance tuning

2004-01-15 Thread Michael Brusser
I need to provide recommendations for an optimal value for
shared_buffers. This has to do with an optional feature
we have in our application: if it is turned on, a number of extra
tables are involved in the operations.
I don't have experience with PG performance testing, and I guess that
I may need to account for the size of the tables, and perhaps the size
of the indexes as well..?

Any hints or pointers to related reading would be very much appreciated.
Mike.




Re: [HACKERS] [pgsql-hackers-win32] Win32 signal code - first try

2004-01-15 Thread Claudio Natoli


   1. setjmp/longjmp stack manipulation (i.e. ELOG)
  
  Wrote a small program to check this out. As we 
  hoped/assumed/expected, it looks just fine.
 Great!
 Just checking - we're talking doing this from inside a user APC, right?

Yes, of course! :-)

[From a user APC executing on the main thread.]



  Magnus, are you working on a second run at this stuff?

 Not right this moment, but I hope to do that sometime this weekend. (No
time until then). 

Tell me about it [sigh].


 A thought there. If we go with the select loop version you had tried
 out and just poll (select() with short timeout, poll, loop..), do we
 actually *need* the APCs *at all*? Or should we go with a pure-polling
 solution? (With signals delivered on a separate thread as we discussed,
 of course.) The only advantage we gain from the APCs is that they will
 interrupt the alertable ...Ex() wait functions, no? (It's not much extra
 code to put it in there, but if we don't need it..)

Hmmm, not sure. Possibly we can, although it depends on where else we might
end up needing to wait. I'd be reluctant to undo your good work on this
until we are sure it is safe to...

Cheers,
Claudio





Re: [HACKERS] set search_path and pg_dumpall

2004-01-15 Thread Tom Lane
<[EMAIL PROTECTED]> writes:
 When a search_path has been set, pg_dumpall correctly outputs an ALTER
 DATABASE xxx SET search_path TO 'xxx', but *BEFORE* the schema is created,
 so it doesn't work.

Hm.  It's worse than that really: in ALTER DATABASE SET, we are trying
to check the search path in the wrong context.  Consider:

regression=# create database foo;
CREATE DATABASE
regression=# alter database foo set search_path to 'fooschema';
ERROR:  schema fooschema does not exist

If we are not connected to database foo then we have no way to tell
whether the requested search path is valid.  Presently the backend
is checking the path against the schemas in the *current* database,
which is obviously bogus.

A closely related case is this (which also represents a scenario
where pg_dumpall will fail at the moment):

regression=# create user foo;
CREATE USER
regression=# alter user foo set search_path to 'fooschema';
ERROR:  schema fooschema does not exist

I am inclined to think that raising an error here isn't a good idea
either, since it's quite possible that the user's search path isn't
meant to be used in the current database.  We don't even have
any way to tell which database it is meant to be used in.

So I'm leaning to the thought that we shouldn't change pg_dumpall's
behavior, but instead should relax the backend's error checking so
that it doesn't reject these cases.  To be specific, I think that
for ALTER DATABASE/USER SET search_path, we only want to do a
syntactic check that the search path is valid (ie, it's a list of
identifiers), and not insist that it refer to existing schemas.

The only case where checking schema existence is arguably useful is
ALTERing the current database --- but if we do that, then we still
have to do something to change pg_dumpall's behavior, and existing
pg_dumpall scripts are still broken.  So I'm content to say that we
won't check regardless of which database is the target.

Next question is how exactly to make the change.  It seems like a really
clean solution would involve adding another GucSource or GucContext
value to denote that we're trying to validate an ALTER ... SET value,
and changing the API for GUC variable assign hooks so that
assign_search_path could find out that that's what we're doing.  Should
we go to that much trouble, and if so what should the modified API be?
At the moment search_path seems to be the only GUC variable that has a
context-sensitive checking routine, so maybe a quick kluge for just this
variable is sufficient.  I have a feeling the problem may come up in the
future with other variables, though.
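
For concreteness, the quick-kluge variant might look something like this
(PGC_S_TEST is an assumed new GucSource value, and the body is an elided
sketch, not a patch):

const char *
assign_search_path(const char *newval, bool doit, GucSource source)
{
    char   *rawname = pstrdup(newval);
    List   *namelist;

    /* always check that it parses as a comma-separated identifier list */
    if (!SplitIdentifierString(rawname, ',', &namelist))
        return NULL;            /* syntactically invalid */

    /* only insist that the schemas exist when the value is being
     * applied in a live session, not when validating ALTER ... SET */
    if (source != PGC_S_TEST)
    {
        /* ... existing per-schema existence checks here ... */
    }

    return newval;
}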

Comments?

regards, tom lane



Re: [HACKERS] [GENERAL] Bug and/or feature? Complex data types in tables...

2004-01-15 Thread Tom Lane
[ moved to pg-hackers, since it's *way* off topic for -general ]

Michael Glaesemann <[EMAIL PROTECTED]> writes:
 On Jan 3, 2004, at 2:31 PM, Tom Lane wrote:
 The thing we are missing (i.e., what makes it crash) is an internal
 representation that allows a tuple to be embedded as a field of a larger
 tuple.  I've looked at this a couple of times, and each time concluded
 that it was more work than I could afford to spend at the moment.  The
 support-such-as-it-is for tuple return values uses a structure that has
 embedded pointers, and it doesn't make any effort to get rid of
 out-of-line TOAST pointers within the tuple.  Neither one of those
 things are acceptable for a tuple that's trying to act like a Datum.

 Would you mind explaining this a little more, or pointing me to where I 
 can learn more about this?

Well, to make composite data types into real first-class citizens, we
have to be able to represent their values as ordinary Datums that don't
act differently from run-of-the-mill Datums, except when some operation
that actually wants to understand the contents of the value is invoked.
I think the only workable representation is as a variable-length datum
along the lines of

int32 length word (overall length of datum)
OID type indicator (OID of the composite type)
header fields similar to a normal on-disk tuple
null bitmap if needed
values of fields (themselves also Datums)

It's possible we could leave out the type OID, but given that we found
it useful to include an element type OID in array headers, I'm betting
we want one for composite-type values too.  Without it, we must always
know the exact composite type makeup from context.  (But see below.)

Now, this structure could be TOASTed as a whole, since it's just a
varlena data type.  But we cannot expect the toasting routines to look
inside it --- that would imply that it's not like other varlena data
types after all.  That means that the contained fields had better not be
out-of-line TOAST value references, because there's no way to keep track
of them and keep from deleting the referenced value too soon.  (It would
be workable for them to be compressed inline, but I suspect we don't
really want that either, since it'd interfere with attempts to compress
the overall datum.)  So somehow we'd need to expand out any toasted
component fields, at least before attempting to store such a datum on
disk.  Not sure where to do that cleanly.
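
The expansion step itself seems straightforward enough, something like
this (a sketch against the 7.4-era toast API; the wrapper function is
hypothetical):

#include "postgres.h"
#include "access/tuptoaster.h"

/* Before wrapping a field value into a composite-type datum, expand
 * any compressed or out-of-line TOAST reference it carries. */
static Datum
flatten_composite_field(Datum value, bool isnull, int16 attlen)
{
    /* only varlena fields can be toasted */
    if (!isnull && attlen == -1 &&
        VARATT_IS_EXTENDED(DatumGetPointer(value)))
        value = PointerGetDatum(
            heap_tuple_untoast_attr((varattrib *) DatumGetPointer(value)));

    return value;
}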

The other point was that what's actually returned at the moment from a
function-returning-tuple is a Datum that contains a pointer to a
TupleTableSlot, not a pointer to a datum of this kind.  (Look in
executor/functions.c and executor/execQual.c to see the related code.)
We'd need to change that API, which would be a good thing to do, but
it'd no doubt break some user-written functions.

In particular I do not know how we'd handle functions declared to return
RECORD --- in general, they may need to return composite types that are
defined on-the-fly and don't have any associated type OID.  The
TupleTableSlot convention works for this since it can include a
tupledescriptor that was built on the fly.  We can't have tupdescs
embedded in datums stored on disk, however.

regards, tom lane



[HACKERS] Using storage MAIN

2004-01-15 Thread Christopher Kings-Lynne
Hi guys,

Quick question about how column storage works.  If you set a TEXT field 
to be storage MAIN, does this place a limit on the amount of data that 
can be stored in the row (eg. 8k?)

Chris



[HACKERS] Make SHOW command subqueriable?

2004-01-15 Thread Christopher Kings-Lynne
Is this a neat idea?

SELECT * FROM (SHOW ALL);

eg.

SELECT * FROM tab WHERE character_length(f) > (SHOW block_size);

etc.

Chris



Re: [HACKERS] Make SHOW command subqueriable?

2004-01-15 Thread Kris Jurka


On Fri, 16 Jan 2004, Christopher Kings-Lynne wrote:

 Is this a neat idea?

 SELECT * FROM (SHOW ALL);

So neat in fact that it has been implemented.

SELECT * FROM pg_settings;

Kris Jurka





Re: [HACKERS] Make SHOW command subqueriable?

2004-01-15 Thread Christopher Kings-Lynne
> So neat in fact that it has been implemented.
>
> SELECT * FROM pg_settings;

Damn! I knew that as well!  *sigh*

I'm not thinking right from my current 'shocking postgres performance 
problems nightmare day' today :(

Think: a massively concurrent table that almost everything on the site
relates to, causing heaps of select queries to pile up and use all the
I/O, getting worse, etc...

Chris



Re: [HACKERS] Using storage MAIN

2004-01-15 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
 Quick question about how column storage works.  If you set a TEXT field 
 to be storage MAIN, does this place a limit on the amount of data that 
 can be stored in the row (eg. 8k?)

No, because it will still be forced out-of-line if that's the only way
to make the row fit.  The source code comments may be helpful:

/*----------
 * attstorage tells for VARLENA attributes, what the heap access
 * methods can do to it if a given tuple doesn't fit into a page.
 * Possible values are
 *    'p': Value must be stored plain always
 *    'e': Value can be stored in "secondary" relation (if relation
 *         has one, see pg_class.reltoastrelid)
 *    'm': Value can be stored compressed inline
 *    'x': Value can be stored compressed inline or in "secondary"
 * Note that 'm' fields can also be moved out to secondary storage,
 * but only as a last resort ('e' and 'x' fields are moved first).
 *----------
 */

You could force no-compression, no-out-of-line semantics by setting it
to PLAIN, and then it *would* fail if over 8K.

regards, tom lane



[HACKERS] Log rotation for pg_autovacuum

2004-01-15 Thread Christopher Kings-Lynne
Hi,

What's the best way to do log rolling with pg_autovacuum?  It doesn't 
seem to have any syslog options, etc.  Using 'tee' maybe?

Chris



Re: [HACKERS] Log rotation for pg_autovacuum

2004-01-15 Thread Matthew T. O'Connor
Christopher Kings-Lynne wrote:

> What's the best way to do log rolling with pg_autovacuum?  It doesn't 
> seem to have any syslog options, etc.  Using 'tee' maybe?

I got an email from Mark Hollow saying that he had implemented a syslog 
patch for pg_autovacuum.  Don't know how good it is, but it might be a 
start.

Mark, can you post it to the list?

Matthew



Re: [HACKERS] [GENERAL] Bug and/or feature? Complex data types in tables...

2004-01-15 Thread Joe Conway
Tom Lane wrote:

> int32 length word (overall length of datum)
> OID type indicator (OID of the composite type)
> header fields similar to a normal on-disk tuple
> null bitmap if needed
> values of fields (themselves also Datums)
>
> It's possible we could leave out the type OID, but given that we found
> it useful to include an element type OID in array headers, I'm betting
> we want one for composite-type values too.  Without it, we must always
> know the exact composite type makeup from context.  (But see below.)

Makes sense. But see below...

> Now, this structure could be TOASTed as a whole, since it's just a
> varlena data type.  But we cannot expect the toasting routines to look
> inside it --- that would imply that it's not like other varlena data
> types after all.  That means that the contained fields had better not be
> out-of-line TOAST value references, because there's no way to keep track
> of them and keep from deleting the referenced value too soon.

Why wouldn't we handle this just like we do when we build an array from 
elemental datums (i.e. allocate sufficient space and copy the individual 
datums into the structure)?

Continuing the analogy:

int32   size;      /* overall length of datum */
int     flags;     /* null-bitmap indicator, others reserved */
Oid     relid;     /* OID of the composite type */
int16   t_natts;   /* number of attributes */
bits8   t_bits[1]; /* null bitmap if needed */
Datum  *values;    /* values of fields */

values would be built similar to how it's done in 
construct_md_array/CopyArrayEls/ArrayCastAndSet.

The overlying datatype would be similar to anyarray.

AFAICS SQL2003 (and SQL99) defines something similar to this as a row 
type. It looks like this:

  ROW ( column definition list )

But it also seems to equate a table's row type to a row type in 
section 4.8 (Row types):

  A row type is a sequence of (field name, data type) pairs, called 
fields. It is described by a row type descriptor. A row type descriptor 
consists of the field descriptor of every field of the row type.

  The most specific type of a row of a table is a row type. In this 
case, each column of the row corresponds to the field of the row type 
that has the same ordinal position as the column.

So maybe as an extension to the standard, we could allow something like:

  ROW composite_type_name

Example:

CREATE TABLE foo (id int, tup ROW (f1 int, f2 text));

or ...

CREATE TABLE bar (f1 int, f2 text);
CREATE TABLE foo (id int, tup ROW bar);

> The other point was that what's actually returned at the moment from a
> function-returning-tuple is a Datum that contains a pointer to a
> TupleTableSlot, not a pointer to a datum of this kind.

If you had something akin to arrayin/arrayout, would this still need to 
be changed?

Joe
