Re: [GENERAL] yet another image: db or filesystem ? question

2007-07-17 Thread PFC


a) if the images are in the filesystem (and also under the web root), no  
problem. Just use an <img> tag pointing straight at the file.


b) if the images are in the database...


	You use <img src="images/filename.jpg"> and set up URL rewriting in your  
webserver so that an HTTP request for "images/filename.jpg" becomes  
"serve_image?fname=filename.jpg", with serve_image being a PHP, JSP, or  
whatever script.
	This way when the performance starts to suck too much you can simply  
serve images off the filesystem very easily, just remove the URL rewriting.
	Please use your primary key (an integer) as filename, don't let the users  
name files on your filesystem !!


	If you are trying to control user access rights to files, it is much  
faster to use an authenticator plugin, or lighttpd's mod_secdownload.


	In both cases the web application is only invoked to decide if the user  
can access the image or not ; it does not actually handle the (potentially  
large) file. It is the webserver that does it, and webservers are  
optimized for this purpose.


	If it's for an intranet where you don't expect lots of traffic, though,  
the PHP echoing a bytea it got from postgres works well...
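	For case b), a rough sketch of what the table could look like (names  
invented for the example) :

CREATE TABLE images (
	image_id	SERIAL PRIMARY KEY,	-- use this integer as the filename : images/123.jpg
	mime_type	TEXT NOT NULL,
	data		BYTEA NOT NULL
);

-- the serve_image script then roughly does :
-- SELECT mime_type, data FROM images WHERE image_id = $1;
-- and echoes the bytea with the matching Content-Type header.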



do I need to create a temporary file first in order to use the <img>  
tag?  Or is there some other HTML way of doing this?




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match






Re: [GENERAL] optimizing postgres

2007-07-13 Thread PFC



The parsing has turned out to be pretty intense. It takes about 10-20
minutes for any project. When we are parsing data, it really slows
down the site's  response. I tested serving static webpages from
apache, endless loops in php , but the choke point seems to be doing
any other query on postgres when constructing a php page during
parsing.


Do you do lots of INSERTs without explicitly using transactions ?
You also need to run EXPLAIN ANALYZE on your most frequent queries.
It is very possible the slowdown is just from a forgotten index.
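	For instance, batching the parser's inserts into one transaction instead  
of one commit per row (table name invented) :

BEGIN;
INSERT INTO parsed_rows (project_id, payload) VALUES (1, '...');
INSERT INTO parsed_rows (project_id, payload) VALUES (1, '...');
-- ... thousands more inserts, all inside the same transaction ...
COMMIT;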


As an example, the original designer specified separate tables for
each project. Since they were all the same structure, I suggested
combining them into a single table with a project_id column, but he
said it would take too long to query. I was suspicious, but I went
with his design anyway.


From the small size of the dataset I don't see a justification for 
this...


It turned out he was right for our current set up. When I needed to
empty the project table to re-parse data, doing a cascading delete
could take up to 10 minutes! I cut re-parsing time in half by just
dropping the table and creating a new one. Which was an okay thing to
do, since the table only belonged to one project anyway. But I hate to
think how long it would have taken to do a delete, cascading to child
tables, if all the project data was in a single table.


	That's probably because you forgot to create an index on the referenced  
column. They are not created automatically.
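	For example, if child rows reference projects(id), the index that makes  
the cascading delete fast has to be created by hand (names invented) :

CREATE INDEX project_data_project_id_idx ON project_data (project_id);

-- without it, DELETE FROM projects WHERE id = 42 has to seq-scan
-- project_data for every deleted project row to find the children.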



Since I'm not an expert in Postgres database design, I'm assuming I've
done something sub-optimal. Are there some common techniques for
tuning postgres performance? Do we need beefier hardware?

Or is it a problem with how PHP or apache pools connections to the
database?


	It depends on a lot of stuff, but the first thing is to run EXPLAIN  
ANALYZE on your queries and post the results here.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Polymorphic delete help needed

2007-07-06 Thread PFC

Here is how I implemented something very similar (in PHP) :


- "Node" class and several derived classes.
	- "nodes" table which contains the fields for the base class with  
node_id as a PK and a field which indicates the class
	- "nodes_***" tables which contain the extra fields for the derived  
class "***", having node_id as a primary key.


As you see this is very similar to what you got.
All the "nodes_***" tables have :
node_id PRIMARY KEY REFERENCES nodes( id ) ON DELETE CASCADE



Your method is not exactly what I am doing because I do not have the  
"nodes" table.  I have only the "nodes_***" tables.  But, I believe your  
approach has many advantages.


OK, I see.
The advantages of my approach are :

	- all common fields that are shared by all derived classes are stored in  
the "base" nodes table
	- the PK on this table uniquely identifies any instance, whatever its  
object class
	- the tables holding the derived classes's extra members (ie columns)  
merely refer to it via a FK which is clean


	When you want to get a Node instance by its id, the ORM grabs the row in  
the main table, which contains the type information, then instantiates the  
right object class, grabs the extra row in the extra tables (which shares  
the same id), and gives this data to the instantiated object which then  
uses it to populate its member variables.


	Therefore, there can be no duplicates, and you only need to look into one  
table to know if an object with a specific id exists or not, or to delete  
it.


	Say Nodes can be, for instance, Photos, or Albums, or Articles ; say an  
Album contains Photos and Articles and you want to display its contents :


	- select from relations r JOIN nodes n ON n.id=r.child_id WHERE  
r.parent_id = (id of album)
	=> gets the ids of children along with extra info stored in the relations  
table (like sort order, etc)


- scan the result set and group it by object class
- for each object class :
		- grab the rows into the extra tables for the ids of objects of this  
class using SELECT WHERE id IN (...)

- instantiate and populate

So if you have an Album with 5 Articles and 10 Photos, you do 3 queries 
-
- get 15 rows from Nodes JOIN relations
- get 5 rows from nodes_articles
- get 10 rows from nodes_photos

It's pretty efficient. I believe you can tweak Rails' ORM into doing 
that.
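	Roughly, the three queries could look like this (the relations columns  
other than parent_id/child_id, and all the ids, are invented for the example) :

-- 1) children of the album, with the type info from nodes
SELECT n.id, n.node_type, r.sort_order
FROM relations r
JOIN nodes n ON n.id = r.child_id
WHERE r.parent_id = 42;

-- 2) extra columns for the Articles found above
SELECT * FROM nodes_articles WHERE node_id IN (101, 102, 103, 104, 105);

-- 3) extra columns for the Photos found above
SELECT * FROM nodes_photos WHERE node_id IN (201, 202, 203, 204, 205, 206, 207, 208, 209, 210);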

	FYI it's a PHP ORM that I wrote over the course of various projects.  
Unfortunately PHP is very bad at generic/meta programming, so the  
implementation is quite convoluted, and I did not opensource it because  
the documentation would be quite an effort to write ;)


Rails has an inheritance ability but they do it differently.  They  
simply have a "nodes" table with all the "nodes_***" tables smashed in  
to it.  I did not like that approach at all.


	Hm, another reason why I don't really like the Rails approach... do they  
at least use unique IDs ? Please say yes ;)


But doing what you are doing, I believe I can very nicely fit into Rails  
and (obviously) PostgreSQL.  Plus, your suggestion of moving entries to  
a "trash" bin seems very wise.



The first problem that creates is it makes it hard to do a constraint
on the name/id pair.  I thought about writing a function that would


	Is this ClassName / id only found in the "main" table or is every FK  
implemented as a ClassName / id pair ?


The ClassName / id pair is found only in the relationships table.  There  
are two instances of it however: parent and child.


	I believe duplicating the ClassName and storing it everywhere to be bad.  
With my approach you only need to use the id in reference since two  
instances of the same base class cannot have the same id regardless of  
their class, and you only need the id to instantiate a row, whatever its  
class.


	If you want to specify that a class instance can only have children of  
specific classes (or their subclasses), for example a FriendList can only  
contain instances of People, or whatever, you can use a constraint trigger  
which will check the class of the referenced row against a table  
describing the allowed parent/child combinations.












---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Polymorphic delete help needed

2007-07-06 Thread PFC



I am doing a project using Ruby On Rails with PostgreSQL as the
database.  I have not seen the term polymorphic used with databases
except with Rails so I will quickly describe it.

Instead of holding just an id as a foreign key, the record holds a
"type" field which is a string and an id.  The string is the name of
the table to which the id applies.  (That is slightly simplified).


Here is how I implemented something very similar (in PHP) :

- "Node" class and several derived classes.
	- "nodes" table which contains the fields for the base class with node_id  
as a PK and a field which indicates the class
	- "nodes_***" tables which contain the extra fields for the derived class  
"***", having node_id as a primary key.


As you see this is very similar to what you got.
All the "nodes_***" tables have :
node_id PRIMARY KEY REFERENCES nodes( id ) ON DELETE CASCADE

	So when I delete a Node, the derived class records are automatically  
deleted in the auxiliary tables.
	Since there can be only one distinct node_id per node, you can put ON  
DELETE CASCADE safely.
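	In SQL, the skeleton could look like this (the shared and extra columns  
are just made-up examples) :

CREATE TABLE nodes (
	id		SERIAL PRIMARY KEY,
	node_type	TEXT NOT NULL,		-- which derived class this row is
	title		TEXT			-- example of a column shared by all classes
);

CREATE TABLE nodes_photos (
	node_id	INTEGER PRIMARY KEY REFERENCES nodes( id ) ON DELETE CASCADE,
	width	INTEGER,
	height	INTEGER
);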


	Now, for your tree-thing, the fact that references are polymorphic isn't  
important since they all refer to the same main "nodes" table.


	However as soon as you say "when a node no longer has any relations  
pointing to it", then you get to choose between various flavors of garbage  
collection and reference counting...


	Personally I like to move the deleted or orphaned rows to a "trash"  
folder so they can be recovered if the user deleted the wrong node, for  
instance. Once in a while I "empty trash".



The first problem that creates is it makes it hard to do a constraint
on the name/id pair.  I thought about writing a function that would


	Is this ClassName / id only found in the "main" table or is every FK  
implemented as a ClassName / id pair ?



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [pgsql-general] In memory tables/databases

2007-07-05 Thread PFC


Why not have a table type that writes no WAL and is truncated whenever  
postgres starts? Such a table could then be put in a ramdisk tablespace  
and there would be no transaction atomicity repercussions. Is there  
something I'm missing?


Is this not in the TODO (if not already scheduled for next version ?)
Check ALTER TABLE SET PERSISTENCE ...

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] a JOIN on same table, but 'slided over'

2007-06-26 Thread PFC


OK, check...

test=> CREATE TABLE test (id INTEGER PRIMARY KEY);
test=> INSERT INTO test SELECT random()*5 + n*10 FROM  
generate_series( 1,10 ) AS n;

test=> SELECT * FROM test LIMIT 10;
  id
-
  11
  23
  31
  41
  52
  63
  70
  85
  94
 103

test=> ANALYZE test;
ANALYZE

-- Self Join 1

test=> EXPLAIN ANALYZE SELECT t1.id AS current_id, t2.id AS next_id
FROM test t1, test t2
WHERE t2.id = ( SELECT min(id) FROM test AS t3 WHERE t3.id > t1.id )
ORDER BY t1.id;
QUERY  
PLAN

---
 Sort  (cost=26703.19..26953.19 rows=10 width=8) (actual  
time=5240.392..5271.529 rows=9 loops=1)

   Sort Key: t1.id
   ->  Hash Join  (cost=2691.00..18398.37 rows=10 width=8) (actual  
time=106.588..5179.737 rows=9 loops=1)

 Hash Cond: ((subplan) = t2.id)
 ->  Seq Scan on test t1  (cost=0.00..1441.00 rows=10 width=4)  
(actual time=0.013..34.782 rows=10 loops=1)
 ->  Hash  (cost=1441.00..1441.00 rows=10 width=4) (actual  
time=106.420..106.420 rows=10 loops=1)
   ->  Seq Scan on test t2  (cost=0.00..1441.00 rows=10  
width=4) (actual time=0.007..43.077 rows=10 loops=1)

 SubPlan
   ->  Result  (cost=0.03..0.04 rows=1 width=0) (actual  
time=0.023..0.023 rows=1 loops=19)

 InitPlan
   ->  Limit  (cost=0.00..0.03 rows=1 width=4) (actual  
time=0.021..0.022 rows=1 loops=19)
 ->  Index Scan using test_pkey on test t3   
(cost=0.00..1029.59 rows=3 width=4) (actual time=0.020..0.020 rows=1  
loops=19)

   Index Cond: (id > $0)
   Filter: (id IS NOT NULL)
 Total runtime: 5295.677 ms

-- Self Join 2

test=> set enable_hashjoin TO 0;
test=> EXPLAIN ANALYZE SELECT t1.id AS current_id, t2.id AS next_id
FROM test t1, test t2
WHERE t2.id = ( SELECT min(id) FROM test AS t3 WHERE t3.id > t1.id )
ORDER BY t1.id;
  QUERY  
PLAN

---
 Sort  (cost=30806.48..31056.48 rows=10 width=8) (actual  
time=2876.249..2903.011 rows=9 loops=1)

   Sort Key: t1.id
   ->  Merge Join  (cost=9745.82..22501.66 rows=10 width=8) (actual  
time=2547.830..2820.347 rows=9 loops=1)

 Merge Cond: (t2.id = "inner"."?column2?")
 ->  Index Scan using test_pkey on test t2  (cost=0.00..2828.26  
rows=10 width=4) (actual time=0.035..67.747 rows=10 loops=1)
 ->  Sort  (cost=9745.82..9995.82 rows=10 width=4) (actual  
time=2547.779..2582.889 rows=10 loops=1)

   Sort Key: (subplan)
   ->  Seq Scan on test t1  (cost=0.00..1441.00 rows=10  
width=4) (actual time=0.060..2487.728 rows=10 loops=1)

 SubPlan
   ->  Result  (cost=0.03..0.04 rows=1 width=0)  
(actual time=0.023..0.023 rows=1 loops=10)

 InitPlan
   ->  Limit  (cost=0.00..0.03 rows=1 width=4)  
(actual time=0.021..0.022 rows=1 loops=10)
 ->  Index Scan using test_pkey on  
test t3  (cost=0.00..1029.59 rows=3 width=4) (actual time=0.020..0.020  
rows=1 loops=10)

   Index Cond: (id > $0)
   Filter: (id IS NOT NULL)
 Total runtime: 2923.804 ms

-- DISTINCT ON

test=> EXPLAIN SELECT DISTINCT ON (t1.id) t1.id AS current_id, t2.id AS  
next_id

FROM test t1 JOIN test t2 ON t2.id > t1.id
ORDER BY t1.id, t2.id;
   QUERY PLAN
-
 Unique  (cost=729806679.75..746473346.41 rows=10 width=8)
   ->  Sort  (cost=729806679.75..738140013.08 rows=33 width=8)
 Sort Key: t1.id, t2.id
 ->  Nested Loop  (cost=0.00..100028973.00 rows=33 width=8)
   ->  Seq Scan on test t1  (cost=0.00..1441.00 rows=10  
width=4)
   ->  Index Scan using test_pkey on test t2   
(cost=0.00..583.61 rows=3 width=4)

 Index Cond: (t2.id > t1.id)
(7 lignes)

This one takes much longer (I interrupted it).

-- Using a function

CREATE TYPE test_type AS ( current_id INTEGER, next_id INTEGER );

CREATE OR REPLACE FUNCTION testfunc( )
RETURNS SETOF test_type
LANGUAGE plpgsql
AS
$$
DECLARE
_row	test_type;
BEGIN
_row.current_id = NULL;

FOR _row.next_id IN SELECT id FROM test ORDER BY id LOOP
	IF _row.current_id IS NOT NULL THEN
		RETURN NEXT _row;	-- emit the (previous id, current id) pair
	END IF;
	_row.current_id := _row.next_id;
END LOOP;
RETURN;
END;
$$;

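Then you just select from it like a table :

SELECT * FROM testfunc();
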
Re: [GENERAL] Ordering in SELECT statement

2007-06-26 Thread PFC

On Jun 26, 2007, at 14:41 , [EMAIL PROTECTED] wrote:
and what I need is the following ("old fashion", that is, the "SPACE"  
is another character whose ASCII value is before any other LATIN  
letter's!!)

AB  CD
AB  EF
ABAB
ABD  E


What you don't want :

peufeu=> SELECT column1 FROM (VALUES ('ABCD'), ('A BCD'), ('abcd'),  
('ABcd'), ('AB'), ('AbC d')) AS foo  ORDER BY column1;

 column1
-
 AB
 abcd
 AbC d
 ABcd
 ABCD
 A BCD
(6 lignes)

What you want :

peufeu=> SELECT column1 FROM (VALUES ('ABCD'), ('A BCD'), ('abcd'),  
('ABcd'), ('AB'), ('AbC d')) AS foo  ORDER BY string_to_array( column1, '  
' );

 column1
-
 A BCD
 AB
 AbC d
 abcd
 ABcd
 ABCD
(6 lignes)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] a JOIN on same table, but 'slided over'

2007-06-26 Thread PFC



Now, I'd like to make a JOIN-ed query of that table with itself, so that
I'd get rows paiwise: every row containing data from *two* rows of the
original TEST table so, that those data come from rows of consequtive
ID's - not neceserly (depending on the TEST table contents) continuesly
consequtive. Like:

SELECT * from view_of_test;
 id | id+X | thread | thread+X | info  | info+X
+--++--+---+-
  2 |3 |763 |  764 | A | B
  3 |6 |764 |5 | B | C
  6 |8 |  5 |88946 | C | Cats
  8 |9 |  88946 |69315 | Cats  | Eifel
  9 |   10 |  69315 |96379 | Eifel | G
-
Is there an SQL construct to get it?


	I would use a plpgsql procedure, select all the rows ORDER BY id, keep  
the current and last row in a variable, and that's it.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] minimum function

2007-06-23 Thread PFC



Check out greatest() and least()... (I think ;)
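For instance, least() collapses the whole IF block below into one expression :

SELECT least(arg1, arg2);
-- e.g. SELECT least(5, 4);  returns 4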


On Sat, 23 Jun 2007 18:35:36 +0200, Raymond O'Donnell <[EMAIL PROTECTED]> wrote:


On 23/06/2007 17:17, Gunther Mayer wrote:


Any way I can achieve that on one line? I.e. I want it simpler than
 IF arg1 < arg2 THEN
   RETURN arg1;
ELSE
   RETURN arg2;
END IF;


That looks pretty simple already, but why not enclose it in a pl/pgsql  
function - something like:


create function minimum(a1 integer, a2 integer) returns integer as
$$
begin
   if a1 < a2 then
 return a1;
   else
 return a2;
   end if;
end;
$$
language plpgsql;

- and then you can call it in one line:

select minimum(5, 4);


Ray.


---
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
[EMAIL PROTECTED]
---

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] [PGSQL 8.2.x] INSERT+INSERT

2007-06-22 Thread PFC



I think it would be greatly helpful if the insert...returning could be
seen as a "select" statement and, thus, being usable in the way I
have described. I suspect that the insert...returning is actually
implemented as an insert plus a select.


	You can create a function which does the INSERT RETURNING and returns the  
result, and use this function in a subsequent INSERT.
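	A quick sketch of that (table and column names invented) :

CREATE OR REPLACE FUNCTION insert_parent( _name TEXT ) RETURNS INTEGER AS $$
DECLARE
	_id INTEGER;
BEGIN
	INSERT INTO parent( name ) VALUES ( _name ) RETURNING id INTO _id;
	RETURN _id;
END;
$$ LANGUAGE plpgsql;

-- the function is then usable inside another INSERT :
INSERT INTO child( parent_id, note ) VALUES ( insert_parent( 'foo' ), 'bar' );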


You can also write, in plpgsql :

FOR a,b,c IN SELECT ... LOOP
INSERT INTO table1 (...) VALUES (a,b) RETURNING your_id INTO 
_var;
INSERT INTO table2 (...) VALUES (c , _var );
END LOOP

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Accent insensitive search

2007-06-21 Thread PFC


Hi. I have a few databases created with UNICODE encoding, and I would  
like to be able to search with accent insensitivity. There's something  
in Oracle (NLS_COMP, NLS_SORT) and SQL Server (don't remember) to do  
this, but I found nothing in PostgreSQL, just the 'to_ascii' function,  
which AFAIK, doesn't work with UNICODE.


	The easiest way is to create an extra column which will hold a copy of  
your text, with all accents removed. You can also convert it to lowercase  
and remove apostrophes, punctuation etc. Said column is kept up to date  
with a trigger.

Python is suitable for this (use unicodedata.normalize).
	Keeping a copy of the processed data will speed up search versus WHERE  
remove_accents( blah ) = 'text', even with a function index.
	Note that this function could be written in C and use a table on the  
first 64K unicode symbols for speedup.
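	A very rough sketch of the idea (this translate() call only handles a  
handful of common accented characters, and assumes a UTF-8 database where  
translate() is multibyte-aware ; all names are invented) :

CREATE OR REPLACE FUNCTION remove_accents( t TEXT ) RETURNS TEXT AS $$
	SELECT lower( translate( $1,
		'àâäéèêëîïôöùûüçÀÂÄÉÈÊËÎÏÔÖÙÛÜÇ',
		'aaaeeeeiioouuucaaaeeeeiioouuuc' ) );
$$ LANGUAGE SQL IMMUTABLE;

CREATE TABLE documents (
	id		SERIAL PRIMARY KEY,
	body		TEXT,
	body_search	TEXT		-- unaccented lowercase copy, maintained by the trigger
);

CREATE OR REPLACE FUNCTION documents_search_trigger() RETURNS TRIGGER AS $$
BEGIN
	NEW.body_search := remove_accents( NEW.body );
	RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER documents_search_update
BEFORE INSERT OR UPDATE ON documents
FOR EACH ROW EXECUTE PROCEDURE documents_search_trigger();

-- searches then go against the processed column :
-- SELECT * FROM documents WHERE body_search LIKE '%' || remove_accents('café') || '%';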


See attached file.

create_ft_functions.sql
Description: Binary data

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread PFC


2. Oracle, Microsoft, and IBM have a "lot" to fear in the sense of a  
database like PostgreSQL. We can compete in 90-95% of cases where people  
would traditionally purchase a proprietary system for many, many  
thousands (if not hundreds of thousands) of dollars.


	Oracle also fears benchmarks made by people who don't know how to tune  
Oracle properly...


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP

2007-06-16 Thread PFC


I wouldn't call Python *strongly* typed, but I do know what you mean.  I  
think.


	It is strongly typed (string + int = error), just not statically typed  
(but you saw what I mean ;)



"PHP: very loosely typed, does whatever it wants"
yeah php got a life of its own! sure be a lazy programmer and blame
sql injection etc crap on php or try http://www.xdebug.org/ and
others.


No need.
	I either use pg_query_params() which automagically handles all quoting,  
or an ORM which does the same.

There is no reason to include strings in SQL statements except laziness.
	MySQL does not have a mysql_query_params() for PHP, so you have to write  
one, it's pretty simple.


	Python's (and perl) strength in this respect is that they make it easier  
to use the safe solution, ie :

query( "sql with ? or $1 or %s", arg, arg, arg )

PEAR::DB is horrendous.













---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] PHP sucks!! - was: persistent db connections in PHP

2007-06-16 Thread PFC


I've seen similar negative comments before on this list about PHP, and  
I'm curious to know what informs them.


	Maybe the fact that, when I coded a database object/form library, it took  
me LONGER to ensure that empty strings / NULLs / zero valued floats and  
integers / etc were handled correctly than it took me to code the entire  
thing ?


I use PHP quite a bit, and though I wouldn't claim to be any sort of an  
expert, I like it a lot: IMHO it's powerful, fast and easy to use. Mind  
you, I came to PHP from an ASP/VBscript background, so anything would  
have been an improvement.


LOL. Yeah.

Having said that, the main gripes I would have with PHP are (i)  
variables aren't strongly typed, which can bite you unless you're  
careful, and (ii) you don't have to declare variables before using them,  
which can also cause trouble - in VBScript you have "option explicit"  
which forces you to declare your variables; I'd like to see something  
similar in PHP.


Apologies if this is off-topic for this list, but I'm curious as to why  
others reckon that PHP sucks.


	It's a matter of mindset. PHP and Postgres have really opposite mindsets.  
Python is a lot more similar to Postgres, for instance :


	- Postgres, Python : strongly typed, throws an error rather than doing  
funny stuff with your data, your code does what you think it should do.
	- PHP : very loosely typed, does whatever it wants, your code will  
surprise you...


To stay on topic :

	- There is no way to know if your persistent connection is a new  
connection (so you might want to setup and issue a few PREPARE statements)  
or an already "second hand" connection.
	- There is no way to reconnect a dead persistent connection (if you  
reboot your database server, you'll have to restart all your  
webservers/PHP processes manually)
	- pg_query() returns everything as strings (python/psycopg2 converts  
postgres types to python types, DATE becomes datetime.datetime, arrays  
too, and you can customize it)

- handling of NULL, 0, '', array(), 0.0 is horrendous
- NO DAMN UNICODE SUPPORT WTF ? (python/psycopg does the right thing)
- register_shutdown_function() isn't always called
- the object model is horrible
	- no closures or lexical scoping, no first class classes or function,  
strings aren't objects

- built by accretion rather than thoughtful design
- etc, etc

	Still, I use it, and it works. It is always poised to bite though. You  
have to wear chain mail pants with PHP.






---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] What O/S or hardware feature would be useful for databases?

2007-06-16 Thread PFC


Seems CPU makers currently have more transistors than they know what to  
do with, so they're adding cores and doing a lot of boring stuff like  
SSE2, SSE3, SSE4, etc.


	SSE(n) isn't useless since it speeds up stuff like video encoding by,  
say, a few times.


	For databases, I'd say scatter/gather IO, especially asynchronous  
out-of-order read with a list of blocks to read passed to the OS.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] INSERT ... RETURNING in v8.2

2007-06-16 Thread PFC



Holy Crud!
you mean to tell me I can replace:

insert into table(string) values(('one'),('two'),('three'));
select idx from table where string in ('one','two','three');


Yes.

	A smart ORM library should, when you create a new database object from  
form values, use INSERT RETURNING to grab all the default values (SERIALs,  
DEFAULTs, trigger-generated stuff etc). This is much more elegant than  
digging to find the sequence name to currval() it !
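	Something like this (names guessed from the quoted example, "table" being  
a reserved word) :

INSERT INTO mytable( string )
VALUES ('one'), ('two'), ('three')
RETURNING idx, string;

-- one round trip, and you get back the SERIAL idx (and any other defaults)
-- for each of the three inserted rows.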


	I think this feature is priceless (but it would be even better if I could  
do INSERT INTO archive (DELETE FROM active WHERE blah RETURNING *)


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] allocate chunk of sequence

2007-06-16 Thread PFC


The chunk to be allocated is not the same size, so to set the increment  
value will not help.


	I'm sometimes not that subtle, so I'd just use a BIGINT sequence. Think  
about the largest chunk you'll ever get (probably less than 2^30 rows, yes  
?), set this sequence increment to this very high value (this will still  
let you have 2^32 chunks at least), and make each chunk be  
(big_sequence_value + N).


I'm not sure how the nextval function to handle this internally, if it  
has to read and update the sequence object. Does it use some kind of  
lock ? Otherwise the problem mentioned here should happen to nextval  
function also.


	Yes it takes a lock, but the lock is held for a very short time (just  
increment the value and release the lock).


	In InnoDB, the AUTO_INCREMENT lock is held for the entire duration of the  
SQL statement, including any triggers that might fire. This means if you  
have ON INSERT triggers which take some time to process, INSERT  
concurrency is horrendous. Not so with Postgres sequences.


The lock command does not work for the sequence, so in my example, I  
have to use a dummy table for lock to work.  Another thought is to have  
the nextval function takes an extra argument for the increment value  
(instead of the default increment value).


	If you really really want your ids to be contiguous, you can't use the  
bigint sequence above. So :
	If you do not want to spoil Postgres' good concurrency on inserts, you'll  
want to avoid doing I/O while holding a lock. Burning CPU while holding a  
lock is OK unless you want to use all your cores, but waiting for IO while  
locked is BAD.


So, here's the deal :

- the hard way is to patch nextval() for an extra argument
- the easy way is to write a plpgsql function doing this :
- you have a chunk of N rows to insert.
- get lock
- X := nextval()
- call nextval() (N-1) times in a plpgsql FOR loop
- release lock
	- do your INSERTs, using X ... (X+N-1) as ids (generate_series() is your  
friend)


	Thus you only lock while burning CPU calling nextval(). Not that bad. The  
IO-intensive INSERT isn't under the lock so it can be concurrent.
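	A sketch of the easy way (sequence name and advisory-lock key are  
arbitrary ; needs 8.2 for advisory locks, otherwise lock a dummy table  
instead) :

CREATE OR REPLACE FUNCTION allocate_chunk( n INTEGER ) RETURNS BIGINT AS $$
DECLARE
	first_id BIGINT;
BEGIN
	PERFORM pg_advisory_lock( 1 );		-- every allocator must take this lock
	first_id := nextval( 'ids_seq' );
	FOR i IN 1 .. n - 1 LOOP
		PERFORM nextval( 'ids_seq' );	-- burn the rest of the chunk while locked
	END LOOP;
	PERFORM pg_advisory_unlock( 1 );
	RETURN first_id;			-- caller owns first_id .. first_id + n - 1
END;
$$ LANGUAGE plpgsql;

-- usage : X := allocate_chunk( 1000 );  then, outside the lock :
-- INSERT INTO t( id, ... ) SELECT X + g, ... FROM generate_series( 0, 999 ) AS g;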


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] persistent db connections in PHP

2007-06-16 Thread PFC

Hello all!

I'm working on a PHP site using Postgres as a back-end. I have an
include at the top of each page that runs the pg_connect function.

I'm implementing some temporary tables, which I understand are
destroyed automatically at the end of the session. It seems to me that


- Temp tables are local to a session (ie. a connection)
	- Persistent connections in PHP creates a pool of connections. When you  
pg_pconnect() you get a connection from the pool. You have no control over  
which connection you get, so you don't really know if the temp tables have  
been created by a previous connection or not.


	So, the best solution is to create the temp tables inside a transaction,  
with ON COMMIT DROP, do your work, and let them disappear. This will save  
you from getting temp tables already filled by stuff from the previous  
webpage which got this connection. For the same reason, pg_pconnect()  
should ALWAYS be followed by pg_query( "ROLLBACK" ). If you put rollback  
in register_shutdown_function(), be aware that it will NOT be called if  
your script is nuked for trying to malloc more than the allowed memory  
limit, or for other brutal reasons...
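	For example (names invented) :

BEGIN;
CREATE TEMPORARY TABLE work_ids ( id INTEGER ) ON COMMIT DROP;
INSERT INTO work_ids VALUES (1), (2), (3);	-- or fill it from a query
-- ... join against work_ids as much as needed ...
COMMIT;		-- the table vanishes here, whichever pooled connection we were given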


	Also note that PHP, being PHP, sucks, and thusly, will not reconnect  
persistent connections when they fail. You have to kick it a bit.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Postmaster processes taking all the CPU

2007-06-15 Thread PFC


I promised I would get back to the group with the reason. Well, of course,  
it was a query :). I use a filesystem-based search engine (Lucene) that  
takes any desired entity saved into the database, finds the primary keys,  
and then does a SELECT * FROM entity WHERE id IN (:ids). If I get too many  
matches (3000-4000), that will delay my postmaster, and the postmaster  
associated with the query would take 10-15 minutes to process it. So now I  
limit that to 500; anything bigger than that will ask the user to refine  
the query. However, this whole investigation made me observe some things.  
On my server box (7.4.7) I have some queries that execute pretty slowly,  
1.2-1.5 s, which is a lot for a query that goes through 5000 records. On my  
local environment (8.1) the same queries on a similar table size execute  
much faster, like 200-400 ms. Do you know if this is a known issue, or is  
my dev box just better than my server box?


	PG developers are not Microsoft, hence, every version of PG gets meaner  
and faster instead of slower and more bloated.

The speedup you see might simply be due to developer competence...

	Anyway, if you use Lucene for full text search (I didn't try it, I tried  
Xapian which seems similar) :


	Users don't care about searches returning 5000 results ; they usually  
only look at the first pages. It's the sorting that is important, and this  
should be done by Lucene itself (sorting on phrase relevance, for  
instance, is a search engine job). So, you are right in limiting the  
results to 500, it could even be lower. The role of the search engine is  
to sort the good results at the top of the list so users are happy.


	Do you really display 500 results ? If you paginate the results, the  
users will probably never go past page3. Fetching a lot of results is  
therefore a waste of database power : just fetch the ids from Lucene and  
grab the results from Postgres, but only for the page you are going to  
display.


	If you use Postgres to do the sorting (example : lucene matches the  
records and you want to order them by a field which is not stored in  
Lucene but in Postgres) obviously this advice does not apply.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] pointer to feature comparisons, please

2007-06-14 Thread PFC



The DELETE should block, no?

 Why ?
 Foreign keys put an ON DELETE trigger on the referenced table


Foreign keys that silently, automatic DELETE records?
Did I read that correctly?


Isn't that the point of ON DELETE CASCADE ?

besides checking the referencing column on insert/update... If you just  
implement a constraint, you only get half the functionality.


But when I define a FK *constraint*, that's all I *want*!


	When you add a REFERENCE foo( bar ) foreign key and don't specify ON  
DELETE clause, I believe you get ON DELETE NO ACTION, which means the  
referenced row can't be deleted if there are referencing rows...  so when  
you have a FK constraint, it actually constrains two tables, and pg uses  
share locking to ensure race conditions can't cause an inconsistent  
database.




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] inner join problem with temporary tables

2007-06-13 Thread PFC



This message appears:

ERROR: relation "t_arti" does not exist
SQL state: 42P01
Context: SQL function "test1"


Why it does not work???
thanks for your help


	Because plpgsql functions are compiled on first execution and all queries  
are then prepared. All tables are referenced directly in prepared  
statements, not by name. Any prepared statement that refers to dropped  
tables (even dropped temp tables) is thus unfit for consumption.


	This allows queries in plpgsql functions to be extremely fast, but it  
isn't smart enough (yet) to recompile functions when a table the function  
depends on is dropped.


	Just disconnect and reconnect, all prepared plans will be lost, and it  
will work. Or issue your queries directly instead of using a function.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] pointer to feature comparisons, please

2007-06-13 Thread PFC
On Thu, 14 Jun 2007 00:09:20 +0200, Ron Johnson <[EMAIL PROTECTED]>  
wrote:



On 06/13/07 16:59, PFC wrote:

Isn't it *supposed* to miss UNcommitted changes from other transactions?
 Well, if the "uncommitted change" is a DELETE of the row that  
allowed the constraint check to pass, then when this delete is  
committed, your data is no longer consistent.


The DELETE should block, no?


Why ?

	Foreign keys put an ON DELETE trigger on the referenced table besides  
checking the referencing column on insert/update... If you just implement  
a constraint, you only get half the functionality.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] pointer to feature comparisons, please

2007-06-13 Thread PFC

Isn't it *supposed* to miss UNcommitted changes from other transactions?


	Well, if the "uncommitted change" is a DELETE of the row that allowed the  
constraint check to pass, then when this delete is committed, your data is  
no longer consistent.


Consider this :

CREATE TABLE A( attributes INT[], CHECK( is_valid_attributes( attributes  
)) )


CREATE TABLE valid_attributes ( attribute_id INTEGER )

	You want to check that A.attributes is an array of values, the only  
allowed values being stored in valid_attributes table. If you delete a row  
in valid_attributes, many rows in A can become invalid unless you use some  
form of trigger on valid_attributes which would start to look a lot like a  
foreign key ON DELETE trigger. If you insert stuff in A while concurrently  
deleting a row in valid_attributes, you have problems. This is why foreign  
key checks take share locks on referenced tables...



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Numeric performances

2007-06-04 Thread PFC



If you try it with max() you'd likely get less-surprising answers.


So it was in fact the type conversions that got timed.
Damn. I got outsmarted XDD

Rewind :

CREATE TEMPORARY TABLE test AS SELECT a::FLOAT AS f,  
(a::NUMERIC)*100 AS n, a::INTEGER AS i, a::BIGINT AS b FROM  
generate_series( 1,10 ) AS a;


Max and Sort will use comparisons :

SELECT max(i) FROM test; SELECT max(b) FROM test; SELECT max(f) FROM test;  
SELECT max(n) FROM test;

Temps : 42,132 ms
Temps : 59,499 ms
Temps : 58,808 ms
Temps : 54,197 ms

SELECT * FROM test ORDER BY i LIMIT 1; SELECT * FROM test ORDER BY b LIMIT  
1; SELECT * FROM test ORDER BY f LIMIT 1; SELECT * FROM test ORDER BY n  
LIMIT 1;

Temps : 58,723 ms
Temps : 60,520 ms
Temps : 53,188 ms
Temps : 61,779 ms

SELECT count(*) FROM test a JOIN test b ON (a.i=b.i);
Temps : 275,411 ms

SELECT count(*) FROM test a JOIN test b ON (a.b=b.b);
Temps : 286,132 ms

SELECT count(*) FROM test a JOIN test b ON (a.f=b.f);
Temps : 295,956 ms

SELECT count(*) FROM test a JOIN test b ON (a.n=b.n);
Temps : 321,292 ms

SELECT count(*) FROM test a JOIN test b ON (a.i=b.b);
Temps : 281,162 ms

SELECT count(*) FROM test a JOIN test b ON  
(a.n=b.i::NUMERIC*100);

Temps : 454,706 ms

Now, addition :

SELECT count(i+1) FROM test;
Temps : 46,973 ms

SELECT count(b+1) FROM test;
Temps : 60,027 ms

SELECT count(f+1) FROM test;
Temps : 56,829 ms

SELECT count(n+1) FROM test;
Temps : 103,316 ms

Multiplication :

SELECT count(i*1) FROM test;
Temps : 46,950 ms

SELECT count(b*1) FROM test;
Temps : 58,670 ms

SELECT count(f*1) FROM test;
Temps : 57,058 ms

SELECT count(n*1) FROM test;
Temps : 108,379 ms

SELECT count(i) FROM test;
Temps : 38,351 ms

SELECT count(i/1234) FROM test;
Temps : 48,961 ms

SELECT count(b/1234) FROM test;
Temps : 62,496 ms

SELECT count(n/1234) FROM test;
Temps : 186,674 ms

	Conclusion : numeric seems a bit slower (division being slowest  
obviously) but nothing that can't be swamped by a few disk seeks...





---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: NULLS and User Input WAS Re: [GENERAL] multimaster

2007-06-04 Thread PFC



Aaaargh!  No, it doesn't.  It means NULL.  Nothing else.


Well, x = UNKNOWN doesn't make any sense... the answer is UNKNOWN.
	x IS UNKNOWN does make sense, the answer is true or false. Replace  
UNKNOWN with NULL...


	Actually it means what the DBA wants it to mean (which opens the door to  
many a misguided design...)


	I hereby light a candle to the pgsql designers who didn't inflict  
0000-00-00 00:00:00 upon us besides NULL dates.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Numeric performances

2007-06-04 Thread PFC


This is a 32 bit CPU by the way.

Consider this :

- There are 100K rows
- The CPU executes about 3 billion instructions per second if everything  
is optimum

- "SELECT sum(n) FROM test", takes, say 60 ms

This gives about 1800 CPU ops per row.
A Float addition versus an Int addition is a drop in the sea.

	I believe the marked difference between Floats/Ints (faster) and  
Numeric/Bigint (slower) comes from being passed by value or by pointers.


	A single access which misses the CPU cache and has to go fetch data from  
the real RAM spends a lot more cycles than the simple loops in a NUMERIC  
addition which will hit L1 cache.
	Nowadays cache access patterns matter more than how many actual CPU  
instructions are executed...



forum_bench=> SELECT sum(i) FROM test;
Temps : 46,589 ms
forum_bench=> SELECT sum(f) FROM test;
Temps : 63,865 ms



forum_bench=> SELECT sum(b) FROM test;
Temps : 157,018 ms
forum_bench=> SELECT sum(n) FROM test;
Temps : 124,816 ms


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Numeric performances

2007-06-04 Thread PFC

It is.  But why do you care?  You either have the correctness that
NUMERIC gives, or you don't.


I suspect it's still useful to know what order of magnitude slower it  
is. After all if it is 1000x slower (not saying it is), some people may  
decide it's not worth it or roll their own.


Any hints/gotchas for/when doing such performance tests?



forum_bench=> CREATE TEMPORARY TABLE test AS SELECT a::FLOAT AS f,  
a::NUMERIC AS n, a::INTEGER AS i, a::BIGINT AS b FROM  
generate_series( 1,10 ) AS a;

SELECT
Temps : 1169,125 ms

forum_bench=> SELECT sum(i) FROM test;
Temps : 46,589 ms

forum_bench=> SELECT sum(b) FROM test;
Temps : 157,018 ms

forum_bench=> SELECT sum(f) FROM test;
Temps : 63,865 ms

forum_bench=> SELECT sum(n) FROM test;
Temps : 124,816 ms

SELECT * FROM test ORDER BY i LIMIT 1; SELECT * FROM test ORDER BY b LIMIT  
1; SELECT * FROM test ORDER BY f LIMIT 1; SELECT * FROM test ORDER BY n  
LIMIT 1;

Temps : 68,996 ms
Temps : 68,917 ms
Temps : 62,321 ms
Temps : 71,880 ms

BEGIN; CREATE INDEX test_i ON test(i); CREATE INDEX test_b ON test(b);  
CREATE INDEX test_f ON test(f); CREATE INDEX test_n ON test(n); ROLLBACK;

CREATE INDEX
Temps : 102,901 ms
CREATE INDEX
Temps : 123,406 ms
CREATE INDEX
Temps : 105,255 ms
CREATE INDEX
Temps : 134,468 ms



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] why postgresql over other RDBMS

2007-06-03 Thread PFC


I believe you have made quite a good description of what happens.

	Index-building isn't very CPU-intensive for integers (geometrics and  
tsearch is another matter, of course), so building all indexes of a large  
table in one pass is a possibility that works now, provided you issue all  
create index commands in concurrent connections at roughly the same time.


I don't think pgrestore does this, though.


So it works right now, except it doesn't have (yet) the infrastructure to  
keep the scans synchronized


Perhaps you only got one read of the table because the process is  
essentially self-synchronizing. Whenever one process "gets ahead", it  
requires a disk read for the next page, which causes it to block for a  
relatively long time, during which time the other two processes either  
proceed reading rows from cache, or come to the end of the cache and block  
waiting for the same page to be read from disk. Obviously not a guarantee,  
as indexing a relatively more expensive type COULD cause one process to get  
multiple pages behind, and memory usage by other processes COULD cause  
intervening pages to be flushed from cache. But I have a suspicion that the  
experiment was not just a happy fluke, that there will be a strong tendency  
for multiple simultaneous index operations to stay sufficiently closely  
synch'd that the table will only be read from disk once. (Especially when  
such operations are done while the database is otherwise quiescent, as  
would be the typical case during a restore.)





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: NULLS and User Input WAS Re: [GENERAL] multimaster

2007-06-03 Thread PFC



Yeah, it is awful ;^)  However the existing system is equally awful
because there is no way to enter NULL!


Consider this form :

First name :Edgar
Middle name :   J.
Last name : Hoover

Now, if someone has no middle name, like "John Smith", should we use NULL  
or "" for the middle name ?
NULL usually means "unknown" or "not applicable", so I believe we have to  
use the empty string here. It makes sense to be able to concatenate the  
three parts of the name, without having to put COALESCE() everywhere.


Now consider this form :

City:   
State   :
Country :

If the user doesn't live in the US, "State" makes no sense, so it should  
be NULL, not the empty string. There is no unnamed state. Also, if the  
user does not enter his city name, this does not mean he lives in a city  
whose name is "". So NULL should be used, too.


It is very context-dependent.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] general features

2007-06-03 Thread PFC



1-max amount of available storage data.


How many hard disks can you buy ?


2-Clustering support.


Please be more precise.


3-price.


Free.


4-speed.


Proportional to the expertise of the DBA.
	Postgres can be extremely fast if used correctly, it can totally suck if  
the database is badly designed.
	But this is completely normal. It is as fast as the other major players,  
and all of them need expertise to work well.
	If you're a newbie, you'll make newbie errors, fortunately this list has  
many friendly & knowledgeable people who can help you, and the docs are  
excellent.



THIS EMAIL AND ANY ATTACHED FILES ARE CONFIDENTIAL.


Aw. Come on.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Transactional DDL

2007-06-03 Thread PFC

On Sun, 03 Jun 2007 01:39:11 +0200, Tom Lane <[EMAIL PROTECTED]> wrote:


"Jaime Casanova" <[EMAIL PROTECTED]> writes:

Tom's example seems to show that mysql inserts a commit immidiatelly
after a DDL but this one example shows the thing is worse than that.


Actually, I think their behavior is just "DDL issues a COMMIT", so that
after that you are out of the transaction and the INSERT commits
immediately.  Some experimentation shows that mysql doesn't issue a
warning for rollback-outside-a-transaction, so the lack of any complaint
at the rollback step is just standard mysql-ism.


	Yes, their manual explains this and warns against it. The full list is  
here :


http://www.mysql.org/doc/refman/5.1/en/implicit-commit.html




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Transactional DDL

2007-06-02 Thread PFC


This is what happens in every RDBMS. Whats so special about postgres  
then?


mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE ble ( id INTEGER ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.09 sec)

mysql> INSERT INTO ble VALUES (1),(2),(3);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM ble;
+--+
| id   |
+--+
|1 |
|2 |
|3 |
+--+
3 rows in set (0.00 sec)


psql=> BEGIN;
BEGIN
psql=> CREATE TABLE ble ( id INTEGER );
CREATE TABLE
psql=> INSERT INTO ble VALUES (1),(2),(3);
INSERT 0 3
psql=> ROLLBACK;
ROLLBACK
psql=> SELECT * FROM ble;
ERREUR:  la relation «ble» n'existe pas

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Stored Procedures and Functions

2007-06-02 Thread PFC



MySQL supports procedures and functions.

Functions can return results but cannot update the database.
Procedures can update the database but cannot return results.

However :
- a function can call a procedure that updates the database !
- a procedure can return result through OUT parameters !

	It's a friggin mess. In pgsql, if you want, a STABLE or IMMUTABLE  
procedure is a function since it is repeatable : it will always return the  
same results with the same parameters, and has no side-effects. This is  
the definition of a function.


	It is better not to draw useless lines in the ground with huge "don't  
walk over this line" stickers. People will always find a way around.  
Better offer features that users need.




Is it true that postgres doesn't have a notion of Stored Procedures
and functions is what it has instead?
RDBMS like Sql Server supports both stored procedures and functions.
So I was wondering what is the difference between a Stored Procedure
and a function.


Pascal has functions and procedures. C has only functions. That doesn't
say anything about the relative usability of each language. Those are
just names.




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] multimaster

2007-06-02 Thread PFC


Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,  
but if it happens to be the one you are in, then it's an option.  I took  
a quick look at the user's page, and both of them were using PostgreSQL.


I just love those Java guys.
The world starts and ends with Java.
	How do you execute a psql script with that thing ? How do you actually  
use any non-java stuff with it ?
	The same features could be implemented in a connection pool like  
pgpool2...




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] why postgresql over other RDBMS

2007-06-02 Thread PFC
On Sat, 02 Jun 2007 00:14:28 +0200, Ron Johnson <[EMAIL PROTECTED]>  
wrote:



On 06/01/07 16:38, PFC wrote:


Will the synchronized seq scan patch be able to do this by  
issuing all  the CREATE INDEX commands at the same time from several  
different database  connections ?

 No, but it could someday.
 Actually I tested, it does it right now, albeit unconsciously (pg  
doesn't do anything to synchronize the scans, but if you launch the  
concurrent connections at the same time and issue all your "create  
index" at the same time, only 1 table scan is needed). Maybe if the  
tables were bigger, it would lose sync between the 3 concurrent scans  
and would end up going slower. That's why I spoke about the  
"synchronized scan" patch.


How much of this, though, is from the OS's disk cache?  Or are Seq Scans  
O_DIRECT and bypass the OS cache?



	Well, the file was larger than disk cache, and I checked in vmstat's  
number of actual bytes read from disks...

Three threads read the table once, One thread reads the table 3 times.

	So it works right now, except it doesn't have (yet) the infrastructure to  
keep the scans synchronized, and psql can't open several connections (yet).


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] why postgresql over other RDBMS

2007-06-01 Thread PFC


	Will the synchronized seq scan patch be able to do this by issuing  
all  the CREATE INDEX commands at the same time from several different  
database  connections ?

 No, but it could someday.


	Actually I tested, it does it right now, albeit unconsciously (pg doesn't  
do anything to synchronize the scans, but if you launch the concurrent  
connections at the same time and issue all your "create index" at the same  
time, only 1 table scan is needed). Maybe if the tables were bigger, it  
would lose sync between the 3 concurrent scans and would end up going  
slower. That's why I spoke about the "synchronized scan" patch.


Or would a "CREATE MANY INDEXES" (where in one statement you specify all  
the indexes on a single table) command be easier to implement?


	You can get the effect right now by using concurrent connections it  
seems. Not very practical in a psql script, though...


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] collision in serial numbers after INSERT?

2007-06-01 Thread PFC



In the last versions of postgres, do :

INSERT INTO blah RETURNING blah_id

	No need to worry about sequences or anything. It inserts, then it returns  
the inserted id, as the name says.


	Very much unlike MySQL where insert_id() returns the id of the last  
insert, even if it was done in an ON INSERT TRIGGER so isn't what you want  
at all !
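	For example (table and column names invented) :

INSERT INTO orders( customer, total )
VALUES ( 'bob', 9.99 )
RETURNING order_id;

-- pre-8.2, the safe equivalent is currval(), which is per-session :
-- SELECT currval( 'orders_order_id_seq' );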




On Fri, 01 Jun 2007 21:39:49 +0200, Bill Moran <[EMAIL PROTECTED]>  
wrote:



In response to [EMAIL PROTECTED]:


Hello all -

I'm working on a site with PHP and Postgres, coming from a MySQL
background.

I was looking for an equivalent to the mysql_insert_id() function, and
a site recommended this:

function postg_insert_id($tablename, $fieldname)
{
 global $connection_id;
 $result = pg_exec($connection_id, "SELECT last_value FROM ${tablename}_${fieldname}_seq");
 $seq_array = pg_fetch_row($result, 0);
 return $seq_array[0];
}

It relies on pg's sequencing ability.

However, I wondered, if I were in an environment where there were many
concurrent inserts, would it be possible that I didn't get the serial
number of the insert that *I* just did? That if I do an insert, and
several inserts happen after mine, wouldn't I get the id of the latest
row, which is several inserts after mine?


Don't do that.  Please let us know what site recommended that so I can
send an email to the author correcting them.

Instead, do SELECT currval('name_of_sequence'), which is guaranteed to be  
isolated from other sessions.

If you use the code above, sooner or later you're going to get bit.





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] why postgresql over other RDBMS

2007-05-31 Thread PFC

On Thu, 31 May 2007 23:36:32 +0200, PFC <[EMAIL PROTECTED]> wrote:


On Thu, 31 May 2007 22:20:09 +0200, Vivek Khera <[EMAIL PROTECTED]> wrote:



On May 25, 2007, at 5:28 PM, Tom Lane wrote:


That's true at the level of DDL operations, but AFAIK we could
parallelize table-loading and index-creation steps pretty effectively
--- and that's where all the time goes.


I would be happy with parallel builds of the indexes of a given table.   
That way you have just one scan of the whole table to build all its  
indexes.


Just did a test :

- large table (does not fit in RAM)
- rows with text column (forum posts)
- about 700K rows

Time to create 3 indexes : 61 s
Time to create 3 indexes with 3 simultaneous connections : 22 s

That's what you would expect...
	vmstat shows the data is really loaded from disk, once with the 3  
threads, 3 times when indexes are created one at a time.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] why postgresql over other RDBMS

2007-05-31 Thread PFC

On Thu, 31 May 2007 22:20:09 +0200, Vivek Khera <[EMAIL PROTECTED]> wrote:



On May 25, 2007, at 5:28 PM, Tom Lane wrote:


That's true at the level of DDL operations, but AFAIK we could
parallelize table-loading and index-creation steps pretty effectively
--- and that's where all the time goes.


I would be happy with parallel builds of the indexes of a given table.   
That way you have just one scan of the whole table to build all its  
indexes.


	Will the synchronized seq scan patch be able to do this by issuing all  
the CREATE INDEX commands at the same time from several different database  
connections ?


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Geographic data sources, queries and questions

2007-05-30 Thread PFC




In short, you have to update every instance of the key, not only in the
database, but in every application and even in every other  
representation in
the real world. That could include changing people's bookmarks, notes in  
PDAs,
even paper reports sitting on people's desks -- a tall order for an SQL  
query.


	This also applies to misguided databases that REUSE values from  
"auto_increment" columns.

I once had two orders with the same PK value.

	One had been mistakenly deleted, then another one took its place, and all  
hell broke loose.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] SELECT from mytbl;

2007-05-29 Thread PFC
On Wed, 30 May 2007 05:24:57 +0200, Erwin Brandstetter  
<[EMAIL PROTECTED]> wrote:



On May 30, 2:11 am, Rodrigo De León <[EMAIL PROTECTED]> wrote:
(... useful code example snipped)


Now see:

http://www.postgresql.org/docs/8.2/static/plpgsql-control-structures


Thanks for your hints, Rodrigo!
I am aware I can consult pg_catalog / information_schema to retrieve
the information, but that's not quite the "short way" I was hoping
for. Your example is still helpful, though. :)



Python example :

import psycopg2
db = psycopg2.connect( host='/var/run/postgresql', user='annonces',  
database='annonces' )

cursor = db.cursor()

bad = set(('email','website','comment'))

cursor.execute( "SELECT * FROM contacts LIMIT 0" )

fields = [d[0] for d in cursor.description]
print fields
['id', 'name', 'person', 'telephone', 'address', 'zipcode', 'city',  
'fax', 'email', 'website', 'comment', 'group_id', 'name_search',  
'address_search']


print "SELECT "+(','.join([d for d in fields if d not in bad]))+" FROM  
contacts"
SELECT  
id,name,person,telephone,address,zipcode,city,fax,group_id,name_search,address_search  
FROM contacts


print "SELECT " + (','.join(set(fields).difference(bad)))+ " FROM contacts"
SELECT  
city,fax,name_search,name,zipcode,telephone,person,address_search,address,group_id,id  
FROM contacts











Regards
Erwin




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster






Re: [GENERAL] optimisation for a table with frequently used query

2007-05-29 Thread PFC



again thanks - bit of a noob question I know, but it's good to
learn :-)


Well not really since the answer is quite subtle...

You have two columns A and B.
Say you have an index on A, and an index on B.
These queries will make direct use of the index :
A=... or any range on A (BETWEEN <, >, <=, >= etc )
B=... or any range on B (BETWEEN <, >, <=, >= etc )

	Now if you ask for (A=... AND B=...) or ask for (A=... OR B=...) one  
index cannot be used, so postgres uses a bitmap scan to combine the  
indexes (read the docs). It is slightly slower than a direct index scan,  
but still much faster than not using indexes at all.

If you had an index on A,B it would have been used directly.

	If one of the two indexes has very poor selectivity (like just a few  
different values), bitmap scan will not be optimal. If your indexes have  
lots of different values, it will be about as fast as a real index.


	An index on A,B can also do WHERE A=... ORDER BY A,B without actually  
doing the sort (it will pick the rows in index order), which is nice for  
many things, like getting blog comments in order.
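	Concretely (generic names) :

CREATE INDEX my_table_a_b_idx ON my_table( a, b );

-- both columns constrained : straight index scan, no bitmap combination needed
SELECT * FROM my_table WHERE a = 42 AND b = 7;

-- rows come back already in index order, no separate sort step
SELECT * FROM my_table WHERE a = 42 ORDER BY a, b;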


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] problems with SELECT query results

2007-05-29 Thread PFC
SELECT 'PV.LINEITEM:' || partnum || ',' || round(onhand) || ',' ||  
round(qm5) || ',' || round(lsm4) || ',' || ',' || ',' || round(onorder)  
|| ',' || ',' || ',' || binone || ',' || ',' || round(backorderqty) ||  
',' || ',' || round(onhold) || ',' || ',' || ',' || ',' || ',' || ',' ||  
',' || round(qtyperjob) || ',' || round(ordermax) AS gmrim FROM slparts  
WHERE vendor LIKE 'CH%'


You could select columns and build the string in your application ?

The query does work and I am getting results from the database. There  
are values for all 'partnum' in the database, however, the query results  
include blank fields here and there in between the returned records. Why  
am I receiving blank fields for 'gmrim' This absolutely defies logic


	Because one of your fields is probably NULL, and NULL || anything stays  
NULL.


	You have probably been misled to believe they are "blanks" because they  
don't display as "NULL" but as "".

I set psql to display NULL as NULL.

	Whether these columns can, must, or should not contain NULLs depends on
your application... it's for you to choose.
	Use COALESCE, add NOT NULL constraints, or grab the columns and build the
string in your application; the choice is yours.
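
	For example, a hedged sketch using only the first few columns of the
original query (COALESCE substitutes an empty string when a column is NULL) :

SELECT 'PV.LINEITEM:' || partnum
       || ',' || COALESCE(round(onhand)::text, '')
       || ',' || COALESCE(round(qm5)::text, '') AS gmrim
FROM slparts
WHERE vendor LIKE 'CH%';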


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Languages and Functions

2007-05-29 Thread PFC



2. Is there any performance or other advantage to using PL/pgsql over
Pl/Perl or Python?


	Yes, if you want to loop over large amounts of data (FOR row IN SELECT)  
plpgsql will be faster since it does not have to convert the data from  
postgres to python/perl format.
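
	A minimal sketch of such a loop (the function, table and column names are
invented for illustration) :

CREATE OR REPLACE FUNCTION sum_amounts() RETURNS numeric AS $$
DECLARE
    r     RECORD;
    total numeric := 0;
BEGIN
    FOR r IN SELECT amount FROM orders LOOP
        total := total + r.amount;
    END LOOP;
    RETURN total;
END;
$$ LANGUAGE plpgsql;

	The rows never leave the server, so nothing is converted to a client-side
format.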


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] optimisation for a table with frequently used query

2007-05-28 Thread PFC




SELECT * from my_table where id_1 = x and id_2 = y;

Neither id_1 or id_2 or the combination of them is unique. I expect
this table to become large over time.


Create an index on (id_1, id_2), or (id_2,id_1).

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] table referencing several others

2007-05-28 Thread PFC


Here what I would do :

	- Have a table "objects", which contains locations, shows, persons, well  
anything you want to relate to the forums in question.

- Simply add an object_id to your forums table.

	Now, how do you distinguish between different types of objects in your  
objects table ?
	You can add an object_type field and teach your object-relational mapper  
that it expresses the class of object that should be instantiated when  
fetching the rows.
	You can have fields that are only used by some types of objects, and null  
for others.

You can use table inheritance.
	You can also have an objects table, and tables locations_extra_infos,  
people_extra_info which store the fields that are unique to each type of  
object.
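
	A hedged sketch of the single objects table variant (all names are
illustrative only) :

CREATE TABLE objects (
    object_id   SERIAL PRIMARY KEY,
    object_type TEXT NOT NULL,   -- 'location', 'show', 'person', ...
    name        TEXT NOT NULL
);

CREATE TABLE forums (
    forum_id  SERIAL PRIMARY KEY,
    object_id INTEGER NOT NULL REFERENCES objects(object_id),
    title     TEXT NOT NULL
);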



- every subsequent message of the thread will carry these useless fields
  (only the first message in thread uses one of them),


	If the fields are set to NULL, they will only use one bit, so this is not  
a problem.
	Do not store the location_id in each message of the thread : if you want  
to change it, you'll have to update all the messages !


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] why postgresql over other RDBMS

2007-05-26 Thread PFC




MySQL has a related problem, which is that they have embedded IPv4
addressing rather deeply into their client authentication logic (by
making userids be [EMAIL PROTECTED] not just a username).  This is probably why
they still haven't got IPv6 support:
http://bugs.mysql.com/bug.php?id=8836
I wonder what their plans are for fixing that ...

regards, tom lane


	Well, I have now almost finished my spare-time forum benchmark of MySQL  
versus Postgres...
	It was a really interesting experience, and by turning a few stones I  
discovered some really nasty stuff about MySQL.. well, about InnoDB  
really. Here's one that you probably didn't know about :


	- auto_increment isn't concurrent, this means all inserts into a table  
which has an auto_increment column are serialized

- yes this means INSERT INTO ... SELECT locks out all other inserts
	- the lock is held during ON INSERT TRIGGERS, which means all INSERT  
TRIGGERs on a given table can't execute concurrently


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] index vs. seq scan choice?

2007-05-25 Thread PFC


Would it be possible to look at a much larger number of samples during  
analyze,

then look at the variation in those to generate a reasonable number of
pg_statistic "samples" to represent our estimate of the actual  
distribution?

More datapoints for tables where the planner might benefit from it, fewer
where it wouldn't.


	Maybe it would be possible to take note somewhere of the percentage of  
occurence of the most common value (in the OP's case, about 3%), in which  
case a quick decision can be taken to use the index without even looking  
at the value, if we know the most common one is below the index use  
threshold...


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] why postgresql over other RDBMS

2007-05-24 Thread PFC



Indeed. Wouldn't it be a cool feature to persists transaction states
across connections so that a new connection could get access to a sub-
transaction state? That way, you could make your schema changes and
test them with any number of test clients (which designate the state
to connect with) and then you would commit when everything works.


	Actually you can hack this by starting your webserver with only 1 thread,  
use persistent connections, and disable all commits in the application.

But I'd call that "a very hackish hack".

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Integrity on large sites

2007-05-24 Thread PFC


Flickr uses InnoDB, by the way.



On Thu, 24 May 2007 18:07:21 +0200, Joshua D. Drake <[EMAIL PROTECTED]>  
wrote:



Dave Page wrote:


problem with your setup. Granted, MySQL is a pretty bad database, but
it's not *that* bad -- your example implies that heavily MyISAM-based
(you don't say whether this is MyISAM or InnoDB) sites such as
Slashdot and Flickr should be falling over every hour.

 I'm not going to comment on who's fault it is, but the OP quoted 100
updates and 600 selects per *second*. I can't imagine Flickr or Slashdot
(which is heavily cached for reading) are under anything like that sort
of constant load.


Uhmmm I would not be surprised at *all* at slashdot or flickr doing  
that type of velocity. We have customers right now that under peak are  
doing 10 times that and yes these are customers that have similar types  
of websites.


Sincerely,

Joshua D. Drake


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Integrity on large sites

2007-05-23 Thread PFC




What version of that pathetic RDBMS is this?


MySQL 5.0.40, on gentoo Linux, Core 2 Duo.

	The table in question takes about 100 inserts/deletes and 600 selects per  
second.
	MyISAM isn't able to finish the benchmark. Actually, I have to run REPAIR  
TABLE every 20 minutes, since it corrupts.



I find it hard to believe that this is MySQL's fault and not some
problem with your setup.


Yeah, me too.
Is it a MyISAM bug, a gentoo bug, a hardware bug ? Who knows.
Go into bugs.mysql.com and search for "corrupt" or "corruption"
Postgres, InnoDB and memtest86 are perfectly happy on this machine.

	However, I have sometimes upgraded MySQL on websites, and found it to  
crash repeatedly, then had to downgrade it.


I submitted this one, consider it my pet bug :
http://bugs.mysql.com/bug.php?id=28534



- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Integrity on large sites

2007-05-23 Thread PFC



Some big sites do of course juggle performance vs in-database run-time
checks, but the statements as typically presented by MySQL partisans,


Live from the front :

	This freshly created database has had to endure a multithreaded query  
assault for about 2 hours.

It gave up.

TABLE `posts` (
  `post_id` int(11) NOT NULL auto_increment,
  `topic_id` int(11) NOT NULL,
etc...

mysql> SELECT max(post_id) FROM posts;
+--------------+
| max(post_id) |
+--------------+
|       591257 |
+--------------+

mysql> INSERT INTO posts (topic_id,post_text,user_id) VALUES (1,'DIE  
BASTARD',666);

ERROR 1062 (23000): Duplicate entry '591257' for key 1

mysql> CHECK TABLE posts;
+-------------------+-------+----------+------------------------------+
| Table             | Op    | Msg_type | Msg_text                     |
+-------------------+-------+----------+------------------------------+
| forum_bench.posts | check | warning  | Table is marked as crashed   |
| forum_bench.posts | check | error    | Found 588137 keys of 588135  |
| forum_bench.posts | check | error    | Corrupt                      |
+-------------------+-------+----------+------------------------------+

mysql> REPAIR TABLE posts;
+-------------------+--------+----------+----------+
| Table             | Op     | Msg_type | Msg_text |
+-------------------+--------+----------+----------+
| forum_bench.posts | repair | status   | OK       |
+-------------------+--------+----------+----------+

mysql> INSERT INTO posts (topic_id,post_text,user_id) VALUES (1,'DIE  
BASTARD',666);

Query OK, 1 row affected, 1 warning (0.10 sec)

mysql> SHOW WARNINGS;
+---------+------+------------------------------------------------+
| Level   | Code | Message                                        |
+---------+------+------------------------------------------------+
| Warning | 1364 | Field 'post_time' doesn't have a default value |
+---------+------+------------------------------------------------+

mysql> SELECT max(post_id) FROM posts;
+--------------+
| max(post_id) |
+--------------+
|       591257 |
+--------------+

mysql> SELECT count(*) FROM posts UNION ALL SELECT sum( topic_post_count )  
FROM topics;

+----------+
| count(*) |
+----------+
|   588137 |
|   588145 |
+----------+

mysql> SELECT count(*) FROM topics WHERE topic_id NOT IN (SELECT topic_id  
FROM posts);

+----------+
| count(*) |
+----------+
|    11583 |
+----------+

(Note : there cannot be a topic without a post in it, ha !)

Try Postgres :

forum_bench=> SELECT count(*) FROM posts UNION ALL SELECT  
sum( topic_post_count ) FROM topics;

 count
--------
 536108
 536108
(2 lignes)

forum_bench=> SELECT count(*) FROM topics WHERE topic_id NOT IN (SELECT  
topic_id FROM posts);

 count
--------
      0
(1 ligne)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Integrity on large sites

2007-05-23 Thread PFC




"*Really* big sites don't ever have referential
integrity. Or if the few spots they do (like with
financial transactions) it's implemented on the

application level (via, say, optimistic locking), never the
database level."


	Sure, but in the forum benchmark I just did, when using MyISAM, with no  
reference integrity checks, at the end of the benchmark, there is an  
impressive number of records with broken foreign key relations... when the  
user kills his HTTP connection or reloads at the wrong moment, and the  
script is interrupted, or killed by an exception or whatever, boom.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Using a trigger with an object-relational manager

2007-05-23 Thread PFC
On Wed, 23 May 2007 14:41:00 +0200, Rick Schumeyer <[EMAIL PROTECTED]>  
wrote:


Actually, the situation is slightly more complicated.  It's more like I  
have tables A1, A2, and A3 each of which must have a corresponding row  
in B.  So each of A1, A2 and A3 has a BEFORE INSERT trigger that creates  
a row in B and sets a FK in A1 (or A2 or A3).  So I can't just use the  
same PK in both the A tables and B.


	It's a hack, but A1, A2, A3 etc could all use the same sequence to  
generate their PK...
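
	A minimal sketch of the shared sequence (names are invented) :

CREATE SEQUENCE ab_ids;
CREATE TABLE a1 (id INTEGER PRIMARY KEY DEFAULT nextval('ab_ids'), payload TEXT);
CREATE TABLE a2 (id INTEGER PRIMARY KEY DEFAULT nextval('ab_ids'), payload TEXT);
CREATE TABLE a3 (id INTEGER PRIMARY KEY DEFAULT nextval('ab_ids'), payload TEXT);
-- ids never collide across a1/a2/a3, so the matching row in B can simply
-- reuse them :
CREATE TABLE b (id INTEGER PRIMARY KEY, extra TEXT);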


The best would be to patch rails so it uses INSERT RETURNING.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Using a trigger with an object-relational manager

2007-05-22 Thread PFC


I'm using Ruby on Rails and have two tables, A and B.  Every row in A  
needs a corresponding row in B.  A also contains a FK pointing to B.


I created a before insert trigger on A that inserts a new row in B, and  
sets the FK in A.  This seems to be running fine.


So, A has a b_id field linking to B ?

	If you need a 1-1 relationship, you could instead try to use the same
primary key in B as in A :


A : id SERIAL PRIMARY KEY
B : id INTEGER PRIMARY KEY (not SERIAL)

Then, AFTER INSERT trigger on A checks the value the sequence put in A.id  
and inserts in B with this value as the PK.
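
	A hedged PL/pgSQL sketch of that trigger (the table, function and trigger
names are invented) :

CREATE TABLE a (id SERIAL PRIMARY KEY, payload TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, extra TEXT);

CREATE OR REPLACE FUNCTION copy_a_into_b() RETURNS trigger AS $$
BEGIN
    -- NEW.id already holds the value the sequence assigned to a.id
    INSERT INTO b (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER a_after_insert AFTER INSERT ON a
    FOR EACH ROW EXECUTE PROCEDURE copy_a_into_b();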


	Postgres has INSERT ... RETURNING which is a very clean and elegant  
solution but Rails never heard about it...





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Performance issues with migrated db

2007-05-22 Thread PFC



I thought the same thing. While I'm not a MS SQL guru, I put 'TOP 100
PERCENT' after SELECT in the query. The Enterprise Manager does not
indicate how many rows come back. I save it as a VIEW in MS SQL and do a
'select count(*)...' and, yes, it comes back 42164877 records.


	No, it comes back 1 record with the count in it, the ORDER BY is useless  
for a count(*), etc.


What is it that you are trying to do exactly ?

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Schema sanity check

2007-05-21 Thread PFC



The other option is to have a column on the mailbox table to flag that
it is a default_mailbox -- but then I'd have to ensure there's only
one column for each "person" flagged that way.


- is_default BOOL column in mailbox table
- conditional index :

CREATE UNIQUE INDEX mailboxes_default_idx ON mailboxes (owner) WHERE is_default = 't';


Second question.  So, after a while the obvious problem happens and
users have too many mailboxes and they want a way to group them into


Looks like a tree. Why not use a LTREE ?

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[GENERAL] feature suggestions

2007-05-21 Thread PFC


	Which list is the most appropriate for proposing features and ideas for  
postgres ?


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Are foreign keys automatically set as indices?

2007-05-19 Thread PFC

On Wed, 16 May 2007 06:09:15 +0200, camb <[EMAIL PROTECTED]> wrote:


Hey all,
I'm sorta new to PG and am just wondering, when I create a foreign key
is it automatically set as an indexed column or do I need to also set
that manually?


Primary key creates unique index.
	Foreign keys do not create any indexes, it is up to you to decide if you  
need indexes or not.


Say :

CREATE TABLE items ( ... category_id INTEGER REFERENCES categories( id ) )

	if you want cascaded deletes/updates to your items table to be fast, or  
you need it for other reasons, create an index on category_id.
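
	A minimal sketch, reusing the example table above :

CREATE INDEX items_category_id_idx ON items (category_id);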


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] 8.0, UTF8, and CLIENT_ENCODING

2007-05-17 Thread PFC



I have a small database (PgSQL 8.0, database encoding UTF8) that folks  
are inserting into via a web form. The form itself is declared  
ISO-8859-1 and the prior to inserting any data, pg_client_encoding is  
set to LATIN1.


Wouldn't it be simpler to have the browser submit the form in utf8 ?

Most of the high-bit characters are correctly translated from LATIN1 to  
UTF8. So for e-accent-egu I see the two-byte UTF8 value in the database.


Sometimes, in their wisdom, people cut'n'paste information out of MSWord


Argh.

and put that in the form. Instead of being mapped to 2-byte UTF8  
high-bit equivalents, they are going into the database directly as  
one-byte values > 127. That is, as illegal UTF8 values.


Sometimes you also get HTML entities in the mix. Who knows.
All my web forms are UTF-8 back to back, it just works. Was I lucky ?
	Normally postgres rejects illegal UTF8 values, you wouldn't be able to  
insert them...


When I try to dump'n'restore this database into PgSQL 8.2, my data can't  
made the transit.


Firstly, is this "kinda sorta" encoding handling expected in 8.0, or did  
I do something wrong?


	Duh ? pg isn't supposed to accept bad unicode data... something  
suspicious is going on.
	Besides, if it was dumped, it should be reloadable... did pg_dump use a  
funky encoding ?


Secondly, anyone know any useful tools to pipe a stream through to strip  
out illegal UTF8 bytes, so I can pipe my dump through that rather than  
hand editing it?


	Yes, use iconv (see man page), it can do this for you quite easily. It's  
probably already installed on your system.
	Be warned, though, that illegal multibyte characters eat quotes at night  
while you aren't looking... unterminated strings are a pain.
	You could also load your database with C locale, and have a script select  
from the records you wish to convert, and update the rows.
	Python has very good Unicode support, should be easy to make such a  
script.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Fault Tolerant Postgresql (two machines, two postmasters, one disk array)

2007-05-17 Thread PFC



Aren't there PCI heartbeat cards that are independent of the load on
the host machine?


	But, if the machine is fork-bombed, or drowning in swap, or generally  
slowly committing suicide, it's not shall we say "available" anymore, so  
you might want to finish it off...


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Transaction commit in a trigger function

2007-05-17 Thread PFC




I can't wrap BEGIN/COMMIT around the INSERT in the trigger.  Is there
another way of achieving this?


- Savepoints (won't work with your trigger approach)
- dblink would allow you to open another connection concurrently
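
	A hedged sketch of the dblink route, assuming contrib/dblink is installed
(the connection string and the statement are illustrative only) :

SELECT dblink_exec('dbname=mydb',
                   'INSERT INTO audit_log (msg) VALUES (''row seen'')');

	Because it runs over a separate connection, the remote INSERT commits
independently of the transaction that fired the trigger.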

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Paypal and "going root"

2007-05-17 Thread PFC



Has anybody pondered this and come up with anything?


	Well, I've done e-commerce stuff although it was with banks, not paypal.  
There are a few levels of safety :


- Level 0 : Total stupidity (osCommerce)
	Bank redirects to your site. Order is marked as paid with "paid=1" in the  
URL.

Solution : fire the programmer

- Level 1 : callback
	Bank hits URL on your site with data you provided (order_id) so you may  
mark the order as paid.
	Technical documentation says to verify the transaction by checking the  
bank server's IP address.

Solution : choose another bank

- Level 2 : crypto
	Bank hits URL on your site with data you provided (order_id) so you may  
mark the order as paid.
	Bank encrypts everything and securely signs the data (HMAC,  
certificate...)

This works well.

	Additionally, you can hit a URL on the bank to check validity. However,  
if this is not encrypted (SSL...) can you be sure who you are talking to ?


	My advice is to store, in your table, the signed and encrypted data the  
bank sends you.
	Suppose a hacker breaks your verification code and marks his order as  
paid in the database. Even so, he will never be able to use the bank's  
certificate to generate a fake signed payment confirmation to insert in  
your log table. And, usually, the data the bank signs includes the user  
name, id, order id, amount, and user-specified fields, that you use to  
store a SHA1 hash of the list of things the user ordered, so he can't  
steal the payment confirmation from another user, or from  a past order.
	So, if later you suspect something, you can re-check the signatures on  
the payment confirmation data, and the fraud will be apparent.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-16 Thread PFC
We use bitfields on our large user table.  It is becoming unworkable to  
scan for matches, since overall most people have very few selections  
made.


We are moving it to a model like your favorite_colors table which just  
links the option and the user.


We find that doing joins on large tables which can be indexed to avoid  
full table scans are very fast in postgres, since the index can do much  
of your culling of potential matching rows.  With bitfields, you are  
more or less forced into doing a sequence scan to find everyone who  
likes the color red.


Of course, if you're playing with only a few thousand users, either  
approach works well.


Things you could try :

* Use an integer array instead of a bitfield
	(for instance, in users table, column favourites would contain { 1,2 } if  
the user selected items 1 and 2 )
	Then, you can make a Gist index on it and use the indexed intersection  
operator


	This is likely the optimal solution if the maximum number of items is
small (on the order of 100 ; many thousands would not be)
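
	A hedged sketch, assuming contrib/intarray is installed (the table and
column names are illustrative, and operator spellings vary a little between
versions) :

ALTER TABLE users ADD COLUMN favourites integer[] NOT NULL DEFAULT '{}';
CREATE INDEX users_favourites_gist ON users USING gist (favourites gist__int_ops);

-- users who selected both item 1 and item 2 (indexed "contains" test) :
SELECT user_id FROM users WHERE favourites @> '{1,2}';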


* keep your bitfields and create conditional indexes :

CREATE INDEX ... WHERE (bitfield_column & 1) <> 0;
CREATE INDEX ... WHERE (bitfield_column & 2) <> 0;
CREATE INDEX ... WHERE (bitfield_column & 4) <> 0;
CREATE INDEX ... WHERE (bitfield_column & 8) <> 0;
CREATE INDEX ... WHERE (bitfield_column & 16) <> 0;
etc...

	Obviously this will only work if you have, say, 10 favouritess. 100  
indexes on a table would really suck.


Then, when looking for users who chose bits 1 and 2, do :
SELECT ... WHERE (bitfield_column & 1) <> 0 AND (bitfield_column & 2) <> 0
postgres will do a bitmap-and using the two indexes

(note : when we get bitmap indexes, this will be even better)

* use tsearch2 :

favourites = 'red blue'
and fulltext-search it

* use a favourites table :

	This makes queries hard to optimize. Consider the table (user_id,  
item_id) meaning user selected this item as favourite.
	If you want to know which users did select both items 1 and 2, you have  
to do a self-join, something like :


SELECT... FROM favourites a, favourites b WHERE a.user_id = b.user_id AND  
a.item_id=1 AND b.item_id = 2


	This is likely to be "not very fast" if 1 million users check each option  
but only 100 check both.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread PFC



Thus, if there are a whole bunch of columns on each table, the data in
those extra columns (e.g. - all columns aside from "id", the one that
was asked for in the result set) will indeed be drawn into memory.


Yeah, I wanted to mean that ;)
	All the columns are loaded (except the TOASTed ones which are not  
mentioned in the query) into memory, but only the requested ones are  
processed and returned to the client...



Is that specific to Postgresql?


Nope. All databases do more or less the same.


From an outside perspective it just
seems odd that potentially a large amount of data would be pulled off
disk into memory that is never used.  Perhaps there's an overriding
reason for this.


Yeah, where would you put this data if you didn't put it where it is 
now ?


If you alter tables "customer" and "order", taking some columns off,
and stowing them in separate tables, then you'll find that more tuples
of "customer" and "order" will fit into a buffer page, and that the
join will be assembled with somewhat less memory usage.

Whether or not that is a worthwhile change to make will vary
considerably.


Makes designing the schema a bit tough. ;)


"Premature optimization is the root of all evil"

Build a test database, fill it with data, and experiment.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread PFC



SELECT o.id
FROM order o
JOIN customer c on o.customer = c.id

Does that bring into memory all columns from both order and customer?
Maybe that's not a good example due to indexes.


	No, it just pulls the columns you ask from the table, nothing less,  
nothing more.


	Splitting tables (vertical partitioning) is used to reduce the size of  
the working set that has to fit in RAM... this is a different reason than  
what you're thinking about.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread PFC




Say, if you first SELECT fname, lname FROM user_table;
and then you issue SELECT * FROM user_table; -- the
second select will be returned from buffer cache -- since
all rows are already in the cache.


	...Unless your table contains some large TEXT columns that have been  
stored out of line (TOASTed) by postgres.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread PFC



Say I have a "user" table that has first, last, email, password, and
last_accessed columns.  This user table will be accessed often.  (It's
not really "user", but that's not important in this discussion)

Say that there's also about 10 columns of settings or preferences for
each user.  Are there any cases or reasons to have a separate
"user_preferences" table vs. just placing all the columns together in
one table?


I did something like that on MySQL some time ago.
	In the Users table there was stuff that other users need to see (like his  
login name, etc), and stuff that only this user needs to see (like his  
preferences).
	So, when displaying posts in the forum, for instance, only a small part  
of the fields in the Users table was needed, the rest was just dead  
weight, that made the table unable to fit in RAM.

So I split the table, and it was faster.

However, you can also buy more RAM...

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Views- Advantages and Disadvantages

2007-05-10 Thread PFC

On Fri, 11 May 2007 04:24:55 +0200, Tom Lane <[EMAIL PROTECTED]> wrote:


"Leif B. Kristensen" <[EMAIL PROTECTED]> writes:

Would it be reasonable to suggest that later versions of PostgreSQL
could examine if a function changes data, and quietly marks a function
as 'stable' if it doesn't?


My instinctive CS-major reply to that is "only if you've found a
solution to the halting problem".  However, it's possible that we could
detect this case for a useful subset of real-world functions ... not
sure offhand what could be covered.

regards, tom lane


	Why not simply have PG issue a warning if the user doesn't specify one of  
("stable", "immutable", etc) on function creation ?

like :

WARNING: Function marked as Volatile by default
INFO: if the function does not modify the database, you might want to mark  
it STABLE or IMMUTABLE to improve performance


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] In theory question

2007-05-09 Thread PFC



This may be a question for -hackers, but I don't like disturbing them
unnecessarily.
I've been having a look at memcached. I would like to ask, is there any
reason that, theoretically, a similar caching system could be built
right into the db serving daemon?
I.e., the hash tables and libevent could sit on top of postmaster as an
optional component caching data on a per-query basis and only hitting
the actual db in the event of a cache miss?


I think this is close to what MySQL's query cache does. The question is  
if this should be the job of the DBMS and not another layer. At least  
the pgmemcache author and I think that it's better done outside the  
DBMS. See http://people.FreeBSD.org/~seanc/pgmemcache/pgmemcache.pdf  
for the idea.


I have always found MySQL's query cache to be utterly useless.

Think about it this way :

It only works for tables that seldom change.
	It does not work for big tables (like the posts table of a forum) because  
the cache would have to be huge.


	So, the most frequent usage of MySQL's query cache is for dumb  
applications who use, for instance, PHP, store their configuration in  
MySQL, and reload it on each and every page with a SELECT * FROM  
configuration.


	In this case, you save the query time, but you don't save : the roundtrip  
between PHP and the database, extracting query results, building objects  
from them, time spent in ORMs, etc.


	A much better solution is to do your own caching, for instance using  
shared memory in the application server, and then you cache native  
language objects. You not only save the query time, but also all the time  
spent building those objects on every page load.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] typical schema for a forum?

2007-05-09 Thread PFC


	I have written a little PHP+postgres forum for benchmarking purposes, to  
see how fast postgres could go.
	It has basic forum features, like forums (duh), topics, posting,  
pagination, watching topics, topic & post count, display newest topic and  
post in topic & forum pages, templates, topics pagination in O(1) not  
O(number of pages), etc.
	I also wrote a companion benchmark script which tortures the server with  
HTTP requests, viewing and posting.


	it answered my "how fast can it go" question : more than 500 dynamic  
pages/second on a Core 2. Most of the load is PHP : postgres could handle  
much, much more than this.


	If people are interested in this as a benchmark, I can write some docs  
(and make a MySQL version).

If you want to use it as a base for your forum, you're welcome.

Who wants the source ?

On Wed, 09 May 2007 10:49:17 +0200, Louis-David Mitterrand  
<[EMAIL PROTECTED]> wrote:



Hi,

I'm trying to implement a forum with mason and postgresql. What is the
typical database schema of a forum (threaded or flat) application?

Thanks,

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Slow query and indexes...

2007-05-08 Thread PFC



Thanks for a good answer, I'll try to find a workaround. The number of
data_loggers will change, but not to frequently. I was actually hoping
to make a view showing the latest data for each logger, maybe I can
manage that with a stored procedure thingy...


	- Create a table which contains your list of loggers (since it's good  
normalization anyway, you probably have it already) and have your data  
table's logger_id REFERENCE it
	- You now have a simple way to get the list of loggers (just select from  
the loggers table which will contain 3 rows)

- Then, to get the most recent record for each logger_id, you do :

SELECT l.logger_id, (SELECT id FROM data d WHERE d.logger_id = l.logger_id  
ORDER BY d.logger_id DESC, d.date_time DESC LIMIT 1) AS last_record_id  
FROM loggers l


2 minute example :

forum_bench=> CREATE TABLE loggers (id SERIAL PRIMARY KEY, name TEXT );
CREATE TABLE

forum_bench=> INSERT INTO loggers (name) VALUES ('logger 1'),('logger  
2'),('logger 3');

INSERT 0 3

forum_bench=> CREATE TABLE data (id SERIAL PRIMARY KEY, logger_id INTEGER  
NOT NULL REFERENCES loggers( id ));

CREATE TABLE

forum_bench=> INSERT INTO data (logger_id) SELECT 1+floor(random()*3) FROM  
generate_series(1,1000000);


forum_bench=> SELECT logger_id, count(*) FROM data GROUP BY logger_id;
 logger_id | count
-----------+--------
         3 | 333058
         2 | 333278
         1 | 333664


NOTE : I use id rather than timestamp to get the last one

forum_bench=> EXPLAIN ANALYZE SELECT logger_id, max(id) FROM data GROUP BY  
logger_id;

 QUERY PLAN
------------------------------------------------------------------------------------------
 HashAggregate  (cost=19166.82..19169.32 rows=200 width=8) (actual  
time=1642.556..1642.558 rows=3 loops=1)
   ->  Seq Scan on data  (cost=0.00..14411.88 rows=950988 width=8) (actual  
time=0.028..503.308 rows=1000000 loops=1)

 Total runtime: 1642.610 ms

forum_bench=> CREATE INDEX data_by_logger ON data (logger_id, id);
CREATE INDEX

forum_bench=> EXPLAIN ANALYZE SELECT l.id, (SELECT d.id FROM data d WHERE  
d.logger_id=l.id ORDER BY d.logger_id DESC, d.id DESC LIMIT 1) FROM  
loggers l;
                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Seq Scan on loggers l  (cost=0.00..3128.51 rows=1160 width=4) (actual  
time=0.044..0.074 rows=3 loops=1)

   SubPlan
 ->  Limit  (cost=0.00..2.68 rows=1 width=8) (actual time=0.020..0.020  
rows=1 loops=3)
   ->  Index Scan Backward using data_by_logger on data d   
(cost=0.00..13391.86 rows=5000 width=8) (actual time=0.018..0.018 rows=1  
loops=3)

 Index Cond: (logger_id = $0)
 Total runtime: 0.113 ms
(6 lignes)

forum_bench=> SELECT l.id, (SELECT d.id FROM data d WHERE d.logger_id=l.id  
ORDER BY d.logger_id DESC, d.id DESC LIMIT 1) FROM loggers l;

 id | ?column?
----+----------
  1 |   99
  2 |  100
  3 |   90
(3 lignes)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid

2006-05-09 Thread PFC



2b- LARGE UPS because HDs are the components that have the higher power
consomption (a 700VA UPS gives me about 10-12 minutes on a machine
with a XP2200+, 1GB RAM and a 40GB HD, however this fall to..
less than 25 secondes with seven HDs ! all ATA),


	I got my hands on a (free) 1400 VA APC rackmount UPS ; the batteries were  
dead so I stuck two car batteries in. It can power my computer (Athlon 64,  
7 drives) for more than 2 hours...
	It looks ugly though. I wouldn't put this in a server rack, but for my  
home PC it's perfect. It has saved my work many times...


	Harddisks suck in about 15 watts each, but draw large current spikes on  
seeking, so the VA rating of the UPS is important. I guess in your case,  
the batteries have enough charge left; but the current capability of the  
UPS is exceeded.



Some hardware ctrlrs are able to avoid the loss of a disk if you turn
to have some faulty sectors (by relocating internally them); software
RAID doesn't as sectors *must* be @ the same (linear) addresses.


	Harddisks do transparent remapping now... linux soft raid can rewrite bad  
sectors with good data and the disk will remap the faulty sector to a good  
one.




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] [SQL] input from a external text file......!

2006-03-11 Thread PFC


inside psql, type :

\i filename



On Sat, 11 Mar 2006 11:29:20 +0100, AKHILESH GUPTA  
<[EMAIL PROTECTED]> wrote:



Hi All.!
I just want to know one thing that is it possible with PGSQL that,
if I want to insert and execute a query from a external text file  
instead of

giving it at the pgsql prompt?
just like in Oracle the file having query is executed with a '@ filename'
statement at the sql prompt..!
plz help me and mail me @ [EMAIL PROTECTED], it's urgent.
thanks in advance...!
(i have searched alot, but didn't found anything)

--
Thanks & Regards,
Akhilesh
DAV Institute of Management
Faridabad(Haryana)
GSM:-(+919891606064)
   (+911744293789)

"FAILURES CAN BE FORGIVEN BUT AIMING LOW IS A CRIME"




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-12 Thread PFC

You need a newer pgadmin --- pg_database.datpath went away in 8.0.
I'm installing the new version. Thanks.
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-12 Thread PFC
pgadmin3 does this (from pg's log with level set at debug5):
INSTRUCTION :  SELECT db.oid, datname, datpath, datallowconn, datconfig,  
datacl, pg_encoding_to_char(encoding) AS serverencoding,  
pg_get_userbyid(datdba) AS datowner,has_database_privilege(db.oid,  
'CREATE') as cancreate
  FROM pg_database db
 ORDER BY datname
ERREUR:  42703: la colonne « datpath » n'existe pas
EMPLACEMENT :  transformColumnRef, parse_expr.c:1099

	Do you know if this is normal, should this column exist, is it a problem  
with pgadmin ?
	Thanks !

On Mon, 11 Apr 2005 18:42:29 +0200, Tom Lane <[EMAIL PROTECTED]> wrote:
PFC <[EMAIL PROTECTED]> writes:
	I have no idea what to type in gbd to get the trace, though
What I usually do is
- start a psql session
- in another window, find out the PID of the backend attached
  to the psql session, and do
$ gdb /path/to/postgres backend_PID
...
gdb> b errfinish
gdb> cont
- go back to psql session and issue problem command
- when gdb stops execution, do
gdb> bt
... useful printout is here ...
gdb> quit
sure you want to exit? y
regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-12 Thread PFC
	Sometimes life has an irony of its own.
	Today I modified some of my gentoo USE flags for something totally  
unrelated to postgres.
	Tonight I built postgres in debug mode : the offending query worked.
	I thought "hm."
	I rebuilt it without debug, and it still works.
	I don't know what made it NOT to work before, I sure didn't hallucinate.  
It must be some obscure incompatibility deep inside the Gentoo package and  
build manager...
	phppgadmin works, and pgadmin doesn't, telling me the "datpath" column  
doesn't exist, no idea what this means. I'm going to look into it.

Thanks for your help and sorry about bothering you !

On Mon, 11 Apr 2005 18:42:29 +0200, Tom Lane <[EMAIL PROTECTED]> wrote:
PFC <[EMAIL PROTECTED]> writes:
	I have no idea what to type in gbd to get the trace, though
What I usually do is
- start a psql session
- in another window, find out the PID of the backend attached
  to the psql session, and do
$ gdb /path/to/postgres backend_PID
...
gdb> b errfinish
gdb> cont
- go back to psql session and issue problem command
- when gdb stops execution, do
gdb> bt
... useful printout is here ...
gdb> quit
sure you want to exit? y
regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-11 Thread PFC

If you want, you can try building with --enable-debug and getting a gdb
traceback from the call to errfinish().  That would at least give us
some clue where in the rule text it's getting confused.
Is this :
./configure --enable-debug
?
I have no idea what to type in gbd to get the trace, though
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-11 Thread PFC

You might try comparing the results of
  select ev_action from pg_rewrite where ev_class = 'pg_user'::regclass;
from the two builds.
regards, tom lane
Well...
	I'll spare your eyes and tell you right away that the results are  
identical... they're at the bottom of the email.

I tried recompiling postgres 8.0.1-r2 with :
CFLAGS="-O2 -pipe -march=pentium3"
which is quite conservative (I have a Pentium-M, this is my dev laptop).
$ gcc --version
gcc (GCC) 3.3.5  (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)
And I still get (in 8.0.1-r2) :
test=> SELECT pdb.datname AS datname, pu.usename AS datowner,  
pg_encoding_to_char(encoding) AS datencoding,
test-> (SELECT description FROM pg_description pd WHERE pdb.oid=pd.objoid)  
AS datcomment,
test-> (SELECT spcname FROM pg_catalog.pg_tablespace pt WHERE  
pt.oid=pdb.dattablespace) AS tablespace
test-> FROM pg_database pdb, pg_user pu
test-> WHERE pdb.datdba = pu.usesysid AND NOT pdb.datistemplate
test-> ORDER BY pdb.datname;
ERREUR:  unexpected right parenthesis

Let's try something else :
postgresql $ createuser -P peufeu
Entrez le mot de passe pour le nouvel utilisateur :
Entrez-le de nouveau :
Le nouvel utilisateur a-t'il le droit de créer des bases de données ?  
(y/n) o
(note that I have to type "o" and not "y" as it's in French... although it  
still displays y/n, "y" means "no" ! it bit me !)
Le nouvel utilisateur a-t'il le droit de créer des utilisateurs ? (y/n) n
Mot de passe :
CREATE USER

peufeu $ createdb
Mot de passe :
CREATE DATABASE
[EMAIL PROTECTED] peufeu $ psql
Mot de passe :
Bienvenue dans psql 8.0.1, l'interface interactive de PostgreSQL.
Tapez:  \copyright pour les termes de distribution
\h pour l'aide-mémoire sur les commandes SQL
\? pour l'aide-mémoire sur les commandes internes
\g ou terminez avec un point-virgule pour exécuter une requête
\q pour quitter
peufeu=> SELECT pdb.datname AS datname, pu.usename AS datowner,  
pg_encoding_to_char(encoding) AS datencoding,
peufeu-> (SELECT description FROM pg_description pd WHERE  
pdb.oid=pd.objoid) AS datcomment,
peufeu-> (SELECT spcname FROM pg_catalog.pg_tablespace pt WHERE  
pt.oid=pdb.dattablespace) AS tablespace
peufeu-> FROM pg_database pdb, pg_user pu
peufeu-> WHERE pdb.datdba = pu.usesysid AND NOT pdb.datistemplate
peufeu-> ORDER BY pdb.datname;
ERREUR:  unexpected right parenthesis

ka-blam.
	I checked there was no invisible UTF-8 character inside the query...  
there isn't, everything's normal...
	I cut bits of the query, trying to get to the bug, and to my surprise, I  
got :

peufeu=> SELECT pdb.datname FROM pg_database pdb, pg_user pu;
ERREUR:  unexpected right parenthesis
Which is interesting, as there is no parenthesis in this query ;)
I'll copypaste the rest of the session :

peufeu=> SELECT pdb.datname FROM pg_database pdb, pg_user pux;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT pdb.datname FROM pg_database pdb, pg_user pu;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT pdb.datname FROM pg_database pdb, pg_user;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT foo.datname FROM pg_database foo, pg_user;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT foo.datname FROM pg_database foo;
  datname
-----------
 test
 peufeu
 template1
 template0
(4 lignes)
peufeu=> SELECT foo.datname FROM pg_user, pg_database foo;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT * FROM pg_user LIMIT 1;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT * FROM pg_user LIMIT 1;
pg_user
peufeu=> SELECT * FROM pg_user LIMIT 1;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT * FROM pg_user;
ERREUR:  unexpected right parenthesis
peufeu=> SELECT 1 FROM pg_user;
ERREUR:  unexpected right parenthesis
Seems I have a cursed table.
I login as postgres :
$ psql -U postgres peufeu
Mot de passe :
Bienvenue dans psql 8.0.1, l'interface interactive de PostgreSQL.
Tapez:  \copyright pour les termes de distribution
\h pour l'aide-mémoire sur les commandes SQL
\? pour l'aide-mémoire sur les commandes internes
\g ou terminez avec un point-virgule pour exécuter une requête
\q pour quitter
peufeu=# SELECT 1 FROM pg_user;
ERREUR:  unexpected right parenthesis
peufeu=# ANALYZE VERBOSE pg_user;
ERREUR:  unexpected right parenthesis
peufeu=# ANALYZE VERBOSE "pg_user";
ERREUR:  unexpected right parenthesis
[EMAIL PROTECTED] peufeu $ echo "ANALYZE VERBOSE" | psql -U postgres peufeu 
2>log
Mot de passe :
ANALYZE
[EMAIL PROTECTED] peufeu $ grep pg_catalog log
INFO:  Analyse de "pg_catalog.pg_shadow"
INFO:  Analyse de "pg_catalog.pg_opclass"
INFO:  Analyse de "pg_catalog.pg_am"
INFO:  Analyse de "pg_catalog.pg_amop"
etc...
[EMAIL PROTECTED] peufeu $ grep user log
[EMAIL PROTECTED] peufeu $
I'm totally perplexed. Have you got any idea ?
I'm itching to do an initdb...
8.0.0 :
 ({QUERY :commandType 1 :quer

Re: [GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-11 Thread PFC

Hmm; the only occurrence of that string in the sources is in
nodes/read.c, which AFAIK would never be invoked for a query entered
straight from the client.  What could trigger it would be trying to
read a rule that is misformatted for some reason.  Maybe you have
a damaged pg_user view definition --- does "select * from pg_user"
fail the same way?
	I installed pg 8.0.0 over 8.0.1-r3 (without touching the data directory)  
and then it worked again, so I don't think it has to do with the data  
being broken. I didn't do createdb or anything. I used gentoo emerge.

	If you need it for debug, I can reinstall 8.0.1-r3 and see if it still  
fails.
	Can you get phppgadmin to work on this latest version ?

Thank you,
P.F. Caillaud

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[GENERAL] pg 8.0.1-r3 killls pgadmin3 and phppgadmin

2005-04-10 Thread PFC
Hello,
	Just installed pg 8.0.1-r3 and now phppgadmin and pgadmin3 die with a  
cryptic error message.
	I've traced it to the following query to get information about the  
database :

SELECT pdb.datname AS datname, pu.usename AS datowner,  
pg_encoding_to_char(encoding) AS datencoding,
(SELECT description FROM pg_description pd WHERE pdb.oid=pd.objoid) AS  
datcomment,
(SELECT spcname FROM pg_catalog.pg_tablespace pt WHERE  
pt.oid=pdb.dattablespace) AS tablespace
FROM pg_database pdb, pg_user pu
WHERE pdb.datdba = pu.usesysid AND NOT pdb.datistemplate
ORDER BY pdb.datname;

	This query looks right to me and pg 8.0.0 does it fine, while 8.0.1-r3  
says :

Error : Unexpected right parenthesis.
This is not very helpful...
What's going on ?
I've downgraded to 8.0.0 in the meantime.
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [GENERAL] basic pg lock question

2005-02-03 Thread PFC

To use this to prevent simultaneous inserts of the same data (for  
example if two employees try to insert the same contact into the DB), I  
suppose you could use a constraint (before insert) that checks that  
there is no data matching the md5 checksum, right?

CREATE TABLE blah
(
mymd5   TEXT NOT NULL,
UNIQUE( mymd5 )
);
will do this for you automatically and create an index to enforce it  
(don't create another index !)

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [GENERAL] Does indexing help >= as well as = for integer columns?

2005-02-01 Thread PFC

This I don't get.  Why is an index scan not used?  Isn't an index  
supposed
to help when using > < >= <= too?
It should !
Explain Analyze Select count(smiles) from structure where _c >= 30
Aggregate  (cost=196033.74..196033.74 rows=1 width=32) (actual  
time=42133.432..42133.434 rows=1
loops=1)
  ->  Seq Scan on structure  (cost=0.00..191619.56 rows=1765669  
width=32) (actual
time=8050.437..42117.062 rows=1569 loops=1)
Filter: (_c >= 30)
Total runtime: 42133.746 ms

See these :
->  Index Scan using "Nc" on structure  (cost=0.00..105528.89 rows=26486  
width=32) (actualtime=0.098..16.095 rows=734 loops=1)
->  Seq Scan on structure  (cost=0.00..191619.56 rows=1765669 width=32)  
(actual time=8050.437..42117.062 rows=1569 loops=1)

	In the index scan case, Planner thinks it'll get "rows=26486" but in  
reality only gets 734 rows.
	In the seq scan case, Planner thinks it'll get "rows=1765669" but in  
reality only gets 1569 rows.

	The two are way off-mark. 26486 still makes it choose an index scan  
because it's a small fraction of the table, but 1765669 is not.

	Analyze, use more precise statistics (alter table set statistics),  
whatever... but you gotta get the planner correctly estimating these  
rowcounts.
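
	For example (the table and column names are taken from the query above ;
the statistics target is just a starting value to experiment with) :

ALTER TABLE structure ALTER COLUMN _c SET STATISTICS 100;
ANALYZE structure;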

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Howto determine the number of elements of an array

2005-01-31 Thread PFC
contrib/intarray
On Mon, 31 Jan 2005 20:00:06 +0100, Együd Csaba <[EMAIL PROTECTED]> wrote:
Hi,
this function is part of GiST which is not installed on my server.  
Should I
install the whole GiST or there is a single package to implement only  
this
feature?

thank you,
-- Csaba
-Original Message-
From: PFC [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 5:16 PM
To: Együd Csaba; 'Sven Willenberger'; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Howto determin the number of elemnts of an array
icount( array )

Hi Seven,
it would be better for me if it returned the number of elements as an
integer even if the array is empty (in this case it returns with NULL).
No
metter this is easily can be worked around, but it would have been
more confortable.
Thank you very much.
-- Csaba Együd
-Original Message-
From: Sven Willenberger [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 4:27 PM
To: pgsql-general@postgresql.org
Cc: Együd Csaba
Subject: Re: [GENERAL] Howto determin the number of elemnts of an
array

Együd Csaba wrote:
Hi,
how can I determin the number of elements of an array? I saw function
array_dims() which returns a string value representing the dimensions
of the array. Is there a function which returns only the number of
elements as an integer.
-- Csaba
array_upper(arrayname,dimension) will return the number of elements in
the specified dimension of array arrayname.
Sven
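
	If you want 0 instead of NULL for an empty array, a minimal sketch :

SELECT COALESCE(array_upper('{1,2,3}'::int[], 1), 0);  -- 3
SELECT COALESCE(array_upper('{}'::int[], 1), 0);       -- 0 instead of NULL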


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [GENERAL] Howto determine the number of elements of an array

2005-01-31 Thread PFC
icount( array )

Hi Seven,
it would be better for me if it returned the number of elements as an
integer even if the array is empty (in this case it returns with NULL).  
No
metter this is easily can be worked around, but it would have been more
confortable.

Thank you very much.
-- Csaba Együd
-Original Message-
From: Sven Willenberger [mailto:[EMAIL PROTECTED]
Sent: Monday, January 31, 2005 4:27 PM
To: pgsql-general@postgresql.org
Cc: Együd Csaba
Subject: Re: [GENERAL] Howto determin the number of elemnts of an array

Együd Csaba wrote:
Hi,
how can I determin the number of elements of an array? I saw function
array_dims() which returns a string value representing the dimensions
of the array. Is there a function which returns only the number of
elements as an integer.
-- Csaba
array_upper(arrayname,dimension) will return the number of elements in  
the
specified dimension of array arrayname.

Sven

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [GENERAL] Splitting queries across servers

2005-01-29 Thread PFC
postgresql-8.0.0beta4 $ time make -j 5
   ... lots of output ...
real0m41.274s
user1m36.315s
sys 0m15.451s
Yikes.
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Splitting queries across servers

2005-01-29 Thread PFC

I have never heard of Propolice SSP. What is it ? Any relation to the  
honey
'Propolys'. just kidding.

Max
The name says little although I like it.
http://www.gentoo.org/proj/en/hardened/
	I was out of date -- Propolice has been renamed PaX.
	The hardened project has many parts, you should read the help on  
grsecurity, but PaX is very interesting :

---
from http://www.gentoo.org/proj/en/hardened/docs/pax-howto.xml :
What is PaX?
PaX is a patch to the Linux kernel that provides hardening in two ways.
The first, ASLR (Address Space Layout Randomization) provides a means to  
randomize the addressing scheme of all data loaded into memory. When an  
application is built as a PIE (Position Independent Executable), PaX is  
able to also randomize the addresses of the application base in addition.

The second protection provided by PaX is non-executable memory. This  
prevents a common form of attack where executable code is inserted into  
memory by an attacker. More information on PaX can be found throughout  
this guide, but the homepage can be found at http://pax.grsecurity.net.

At run time, when a buffer is created, SSP adds a secret random value, the  
canary, to the end of the buffer. When the function returns, SSP makes  
sure that the canary is still intact. If an attacker were to perform a  
buffer overflow, he would overwrite this value and trigger that stack  
smashing handler.
---
	For instance, imagine you have a version of Samba with the latest  
unpatched hole. An attacker can enter. Now if you have PaX all he can do  
is crash the process, and his intrusion attempt is detected and logged.

It's not the final cure for everything, but it covers unpatched holes.
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] Splitting queries across servers

2005-01-29 Thread PFC
Thanks for reminding me that 64bit translates to: recompile everything  
you
need!

I think this is exactly the choice of configuration we are going to make.
Someone just reminded me that windows and linux come down to the same
performance, but that the real overhead is on maintenance. It's true that
linux implies less downtime, and you can always upgrade. On top of that,
there's more granularity control on linux.
	I won't say anything on linux vs windows performance (although I sure do  
have an opinion), but simply keep in mind that postgres on windows is young
and postgres on linux is very mature ; therefore one is likely to have a
lot more performance refinements than the other.

	Also Linux is not that harder to administer (especially gentoo). After  
all, all you want is a DB server, you'll install few software. Not like if  
you needed 50 different apps.

	In your case recompiling everything is also a bonus because you'll be  
sure that everything is 64-bit optimized down to the last driver.
	I wonder how long your killer machine will take to compile the whole OS.  
And Postgres too... could you post that ?

	If you are concerned about security, you can compile everything with the  
stack protector (Propolice SSP) enabled. It's extremely advisable. All the  
other options like grsec et al are nice but require careful thinking,  
propolice just needs to be activated. If you don't know what it is I'll  
post docs.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Allowing update of column only from trigger

2005-01-28 Thread PFC
	First you should use an ltree type for the uri field :
- you write it foo.bar instead of /foo/bar
- there are operators on ltree types to express "is parent of", "is  
children of"
- these operators are indexed

Check the readme :
http://www.sai.msu.su/~megera/postgres/gist/ltree/
	If you have this type of path, I guess you'll often make tree traversal  
operations, and that you'll find the ltree operators extremely useful.
	You can update it with a trigger just like before.

	Now about your update problem, when you rename foo.bar into foo.crum.bar  
you could, in a single update, replace all foo.bar by foo.crum.bar in all  
your table with the ltree operators and special functions.

	And for your checks, you can add a CHECK on the url field to be sure it's  
equal to the url of the parent + the name of the current row. It'll make  
one more SELECT request, though.
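
	A hedged sketch, assuming contrib/ltree is installed (the column and index
names are illustrative) :

ALTER TABLE objects ADD COLUMN path ltree;
CREATE INDEX objects_path_gist ON objects USING gist (path);

-- "is a descendant of foo.bar" uses the indexed <@ operator :
SELECT id, name FROM objects WHERE path <@ 'foo.bar';

	ltree also ships helpers such as nlevel() and subpath(), which make prefix
rewrites (like the foo.bar -> foo.crum.bar rename above) a single UPDATE.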

I have a table like this:
create table objects (
id  serial primary key,
namevarchar not null,
parent  integer references objects(id) default 1 not null,
uri varchar not null
) without oids;
The uri column is a denormalization for performance, storing a "path" to  
the object in the hierarchy, consisting of a sequence of names.

# select id, name, parent, uri from objects;
 id | name | parent |   uri
----+------+--------+-----------
  1 |      |      1 | /
  2 | foo  |      1 | /foo
  3 | bar  |      2 | /foo/bar
(3 rows)
The uri is calculated by a trigger before update on objects.
The original version of the trigger function would re-calculate the uri  
for an object and its immediate children if the name, parent, or uri  
changed. It would apply the uri change to the children, which would  
cascade down the hierarchy. This generally worked, but (1) I was having  
data visibility voodoo, and (2) it was calculating every child's uri  
twice, which would be inefficient for large hierarchies.

So I changed the trigger function so that it would only fire if name or  
parent had changed. I created a recursive helper function that changes  
the uri for all descendants of an object, if the object's name or parent  
has changed. There is no cascade of changes (the trigger fires for all  
the descendants, of course, but doesn't do anything). Works great, is  
more efficient, and I can manage the max_stack_size to fit the size of  
the hierarchy.

The PROBLEM with this is that anyone can now
# update objects set uri='/ha/ha/your/screwed' where id=2;
I want the trigger function and its helper alone to be able to update  
the uri.

What is the best way to do this? Should I put the uri column in a  
separate table, and play with permissions? (Yuck.) Do I need to bite the  
bullet, go back to cascading triggers, and work out the data visibility  
voodoo and the efficiency issue? Or is there a better way that I haven't  
thought of?

Thanks,
Shawn Harrison

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] How to get back the current count of a auto incrementing number.

2005-01-28 Thread PFC
	This question pops up about once a day...
	can't the list server grep for "auto increment" and send a link to the
docs itself ;)
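	For the record, a quick sketch of the usual answer (the sequence name is
only an example) :

SELECT nextval('seqname');         -- draws and returns the next value
SELECT currval('seqname');         -- last value nextval() returned in this session
SELECT last_value FROM seqname;    -- peeks at the sequence's current state

	currval() is the one to use right after an INSERT that consumed the
sequence.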


hi
 How do I get the current value of an auto-incrementing
number generated using CREATE SEQUENCE seqname START 1?
Is there any way, and if so, what is it? Can we run a
SELECT query, maybe with a built-in function, that will
return the current value of that number?
Regards.
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [GENERAL] My postmaster just crashed !

2005-01-27 Thread PFC
	I'm sorry, I forgot to say I had the bug with 8rc3, then installed 8.0.0  
and it is still crashing.

On Thu, 27 Jan 2005 10:52:37 -0700, Michael Fuhr <[EMAIL PROTECTED]> wrote:
[I've Cc'ed pgsql-bugs and set the Reply-To header to that list.]
On Thu, Jan 27, 2005 at 05:26:26PM +0100, PFC wrote:
It seems that contrib/intagg crashes my server :
-
select int_agg_final_array(1);
server closed the connection unexpectedly
I see the same thing with PostgreSQL 8.0.0 (REL8_0_STABLE) on Solaris 9
and FreeBSD 4.11.  Here's part of the backtrace from the core dump on
FreeBSD:
(gdb) bt
#0  0x285077f4 in ShrinkPGArray (p=0x1) at int_aggregate.c:130
#1  0x28507929 in int_agg_final_array (fcinfo=0xbfbfeb90) at  
int_aggregate.c:177
#2  0x8106db8 in ExecMakeFunctionResult (fcache=0x83410a0,  
econtext=0x83411e8,
isNull=0xbfbfecef "\b0ÃÂÂÂK\024\bÂz3\bÃ\003", isDone=0x0) at  
execQual.c:1042
#3  0x8107583 in ExecEvalFunc (fcache=0x83410a0, econtext=0x83411e8,
isNull=0xbfbfecef "\b0ÃÂÂÂK\024\bÂz3\bÃ\003", isDone=0x0) at  
execQual.c:1459
#4  0x8108beb in ExecEvalExprSwitchContext (expression=0x83410a0,  
econtext=0x83411e8,
isNull=0xbfbfecef "\b0ÃÂÂÂK\024\bÂz3\bÃ\003", isDone=0x0) at  
execQual.c:2781
#5  0x8145353 in evaluate_expr (expr=0x8337ab8, result_type=1007) at  
clauses.c:2399


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] visualizing B-tree index coverage

2005-01-27 Thread PFC

I realize that using OR will not result in an index scan.
I will never be interested in an OR condition for the kinds
of searches I use.  In my SELECT statements, I always name
every column of the multi-column index in the same order that
they were named when creating the index.  I always use
the >= condition, and very rarely, the = condition.
All the leftmost index columns must be named, but the order is
unimportant.
You can use (a BETWEEN x AND y) instead of (a>=x AND a<=y) ; it is
cleaner.
However, I am concerned that I must place
the most selective column first in my index.  I cannot tell,
a priori, which column will be most selective.  That depends on the
nature of the search, which can vary widely each time.
Are you saying that if my first column is not selective, even though the  
remaining
columns are, the planner may choose not to use the index after
seeing that the first column is not very selective?
	I thought this was true but made some tests and the index scanner is  
smart.

	Try this :
CREATE TABLE test (id serial primary key, a INTEGER, z INTEGER, e INTEGER,  
r INTEGER, t INTEGER, y INTEGER ) WITHOUT OIDS;
INSERT 1M rows into table using a plpgsql function, with a,z,e,r,t,y being  
floor(random()*10) for instance.
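	For instance, a rough sketch of such a filler function (untested ; the $$
quoting needs 8.0, on older versions quote the body by hand) :

CREATE OR REPLACE FUNCTION fill_test(n integer) RETURNS void AS $$
BEGIN
    FOR i IN 1..n LOOP
        INSERT INTO test (a, z, e, r, t, y)
        VALUES (floor(random()*10)::int, floor(random()*10)::int,
                floor(random()*10)::int, floor(random()*10)::int,
                floor(random()*10)::int, floor(random()*10)::int);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT fill_test(1000000);
CREATE INDEX testa ON test (a, z, e, r, t, y);   -- the multicolumn index to test
ANALYZE test;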

	Then you can try various selects. a,z,e,r,t,y are a linear distribution  
between 0 and 9 included, so :
a>=A AND z>=Z ... y>=Y gives a result set of about  
(10-A)*(10-Z)*...*(10-Y) results. You'll see the planner will use an index  
scan when needed. You can try the easiest case (a>=9) which just explores  
one part of the tree, and the worst case which explores a part of all
leaves (y>=9). Both should yield about the same number of results, but the
first should be faster. To know how much, just try ;)

That seems like an oversight, IMHO.  Shouldn't the overall effect of
using all the columns be considered before choosing not to use an
index scan?
I think it is. There are no cross-column correlation stats, though.
Since I'm using every column of my multi-column index for every search,
and I always use >=, Explain Analyze always shows that every column
is considered in the index scan.  However, that is only when the
index scan is used.  Sometimes, Explain Analyze shows it is not used.
That appears to happen when my search condition is very general.
This is to be expected, so I am not worried.  Most of my searches will
be intermediate, namely not VERY selective, but also not VERY general.
So the idea of the multi-column index is to "characterize" each row
sufficiently, even when it is a perfectly ordinary row with no ONE
feature being distinctive, but rather several features together giving
it its distinctive character.  That is my interpretation of the
multi-column index.
	If you have some features which are highly selective, you can create a
single-column index on them. It won't be used often, but when it is, it
will really pay off.
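	For instance (table and column names are only examples) :

CREATE INDEX mol_rare_idx ON mols (rare_feature_count);
-- or, if only a small fraction of rows have the feature at all,
-- a partial index keeps it small :
CREATE INDEX mol_rare_partial_idx ON mols (rare_feature_count)
    WHERE rare_feature_count > 0;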


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


[GENERAL] My postmaster just crashed !

2005-01-27 Thread PFC
It seems that contrib/intagg crashes my server :
-
select int_agg_final_array(1);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
-
Here's the logfile :
TRACE:  database system is ready
TRACE:  server process (PID 26882) was terminated by signal 11
TRACE:  terminating any other active server processes
FATAL:  the database system is in recovery mode
TRACE:  all server processes terminated; reinitializing
TRACE:  database system was interrupted at 2005-01-27 17:22:48 CET
TRACE:  checkpoint record is at 1/F3854A94
TRACE:  redo record is at 1/F3854A94; undo record is at 0/0; shutdown TRUE
TRACE:  next transaction ID: 5800; next OID: 4533584
TRACE:  database system was not properly shut down; automatic recovery in progress
TRACE:  record with zero length at 1/F3854AD0
TRACE:  redo is not required
TRACE:  database system is ready

What do you think ?
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] self-join on subselect

2005-01-27 Thread PFC
Thanks, I was hoping there was some way to avoid it.
You have to write the subselect twice if you want to match up different  
rows.
In some cases it might be worth making a copy in a temp table. For simple
subselects where there is an index on id, leaving it as is should work  
fine.
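	A rough sketch of both options (table and column names are only
examples) :

-- write the subselect twice :
SELECT a.id, b.id
  FROM (SELECT id, val FROM items WHERE val > 10) AS a,
       (SELECT id, val FROM items WHERE val > 10) AS b
 WHERE a.val = b.val AND a.id < b.id;

-- or materialize it once when the subselect is expensive :
CREATE TEMP TABLE filtered AS SELECT id, val FROM items WHERE val > 10;
CREATE INDEX filtered_val ON filtered (val);
SELECT a.id, b.id
  FROM filtered a JOIN filtered b ON a.val = b.val AND a.id < b.id;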


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Extended unit

2005-01-26 Thread PFC
I wonder if it makes sense to implement the units as separate data types
? Cause that's what they are really.
So "ampere" would be a data type which aliases one of the numeric data
types (depending on what precision range you need), but does not allow
itself to be added to anything other than "ampere". Any other interaction with
other units (read data types) would be achieved by defining the needed
operators on the respective data types (read units).
	You'd have to create a postgres datatype for every variation on m, m/s,  
m/s², etc... which would be kinda unworkable... I think it's better to  
have one datatype (number with unit) and have the operators raise an  
exception when trying to add incompatible units ?
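	A very rough sketch of that idea, just to show the shape of it (untested,
and not an existing extension ; type and function names are made up) :

CREATE TYPE quantity AS (val float8, unit text);

CREATE FUNCTION quantity_add(a quantity, b quantity) RETURNS quantity AS $$
DECLARE
    result quantity;
BEGIN
    IF a.unit <> b.unit THEN
        RAISE EXCEPTION 'incompatible units: % and %', a.unit, b.unit;
    END IF;
    result.val  := a.val + b.val;
    result.unit := a.unit;
    RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE OPERATOR + (leftarg = quantity, rightarg = quantity,
                   procedure = quantity_add);

-- SELECT ROW(1.5, 'm')::quantity + ROW(2.0, 'm')::quantity;   -- fine
-- SELECT ROW(1.5, 'm')::quantity + ROW(2.0, 's')::quantity;   -- raises an error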

	As for the encoding, why not just use a (float, text) pair, with the text
being a parseable representation of the unit ? It could simply be the SI
form (like m/s), which would be a lot more flexible than bitfields. The
problem is that I think it'll always be variable length. Maybe there is
enough space in an int64 to fit it all ? Maybe with Huffman coding ? Is it
really important to save a few bytes ? I don't think so.
	For table columns, the type would be saved in the column definition so  
you'd just have a float anyway.
	I think it's an exciting project !

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] visualizing B-tree index coverage

2005-01-26 Thread PFC
	I think you missed an important "feature" of multicolumn indexes : you had
better not use 'OR' in your expressions. You seem to want only to use '>=',
so this should be OK.

	Suppose you have 3 columns a,z,e containing values linearly distributed  
between ...

select min(a),max(a),min(z),max(z),min(e),max(e) from test;
 min | max | min | max | min | max
-----+-----+-----+-----+-----+-----
   0 |  13 |   0 |  99 |   0 |  99
For instance the following query is indexed :
explain analyze select * from test where a>=0 and z>=90 and e>=0;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using testa on test  (cost=0.00..1637.56 rows=11345 width=16)  
(actual time=0.085..51.441 rows=13000 loops=1)
   Index Cond: ((a >= 0) AND (z >= 90) AND (e >= 0))
 Total runtime: 56.307 ms

The following is only partially indexed :
explain analyze select * from test where (a=1 or a=2) and (z=1 or z=8) and  
e>=0;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using testa, testa on test  (cost=0.00..3269.06 rows=346  
width=16) (actual time=0.328..52.961 rows=400 loops=1)
   Index Cond: ((a = 1) OR (a = 2))
   Filter: (((z = 1) OR (z = 8)) AND (e >= 0))
 Total runtime: 53.297 ms

	You see the 'index cond' field, which determines which rows are fetched ;
those rows are then filtered with the 'filter' expression. Having
the most selective index cond is important because it will diminish the  
number of rows to be fetched. However, in your case the filter expression  
is also beneficial because any row eliminated by the filter will not need  
to go through your expensive matching function.

In this case :
SELECT count(*) FROM test;
=> 131072
SELECT count(*) FROM test WHERE ((a = 1) OR (a = 2));
=> 2
SELECT count(*) FROM test WHERE (a=1 or a=2) and (z=1 or z=8) and e>=0;
=> 400
In this case the index fetches 20k rows out of 131072 but only 400 are  
used...

If you don't use OR, index use is more likely :
explain analyze select * from test where (a,z,e) >= (0,50,80);
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using testa on test  (cost=0.00..1669.78 rows=12627 width=16)  
(actual time=0.087..58.316 rows=13000 loops=1)
   Index Cond: ((a >= 0) AND (z >= 50) AND (e >= 80))
 Total runtime: 63.049 ms

Here you have a full index scan.
To determine the efficiency of your indexes, you can thus use this method :
look at the 'index cond' and 'filter' expressions, and count the rows
matched by each.

particular number of columns
for indexing.  I don't want to use too many, nor too few columns.  I also
want to optimize the nature (which atom types, bond types, etc.)
of the count columns.  While I could do this
and use the speedup as the measure of success, I think
that if my B-tree were "covering" the data well, I would get the best  
results.
Covering means finding that optimal situation where there is not one  
index for all rows
and also not a unique index for every row - something in between would be
ideal,
or is that basically a wrong idea?
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Recursive queries

2005-01-26 Thread PFC
Check out ltree
http://www.sai.msu.su/~megera/postgres/gist/ltree/
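	A taste of what the queries look like (paths are only examples) :

SELECT * FROM nodes WHERE path <@ 'top.science';   -- all descendants of top.science
SELECT * FROM nodes WHERE path ~ '*.astronomy.*';  -- lquery pattern match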
On Tue, 25 Jan 2005 22:03:58 +0100, tmp <[EMAIL PROTECTED]> wrote:
I don't think anybody has written the syntactic sugar, but someone did
write a function that provides equivalent output.
I think it is important that the functionality lies in the database
engine itself: that way it can make more efficient use of the optimizer.
Also, I think this "recursive" feature is *the* most important upcoming
improvement: currently there is simply no efficient way of fetching
linked structures, which is nevertheless quite common in many areas.
Regards
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if  
your
  joining column's datatypes do not match


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] Extended unit

2005-01-26 Thread PFC

	If you allow multiplication and division, you'd need to store not only one
unit, but a whole expression like m.s^-2, etc. You'll end up with something
like Maple. Isn't there some free open source algebraic computation toolkit
with equations and units somewhere ?

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] on update / on delete performance of foreign keys

2005-01-26 Thread PFC
	It's a bit more complicated than that, as there are also locking issues
(what if other processes insert rows while some others are being
deleted ?). Really, the whole thing isn't trivial.


Since postgres already incorporates code to check foreign keys more
efficiently (when doing alter table ... add constraint .. foreign key,
postgres seems to use a merge or a hash join, instead of a nested loop),
I wondered how hard it would be to use this for the triggers too.
I imagined creating a statement-level trigger in parallel to the
row-level triggers, and defining some threshold (let's say, more than
10% of the rows deleted). If the threshold is reached, the row-level
trigger would just do nothing, and the statement-level trigger would
delete the referencing records doing a join.
Would this be feasible? And would it be something a newbie could tackle,
or is it more involved than I think?
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Calculating a moving average

2005-01-26 Thread PFC
	Make a plpgsql function which iterates over the rows to be averaged
(FOR row IN SELECT, with the correct ORDER BY, of course), then use an
array as a FIFO : push each new value and add it to the running sum, pop
the oldest one and subtract it.
	Roundoff errors will bite your nether regions.

	I suppose the point of this is that you don't want to fetch the whole
thing, and that your moving average need not have one result row per source
row (or you'd just do it in the application) ; so you can pre-shrink your
dataset by putting some avg() and GROUP BY in your source SELECT.
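	A rough sketch of it (untested ; table and column names are only examples,
and the $$ quoting needs 8.0, on 7.4 quote the body the old way) :

CREATE OR REPLACE FUNCTION moving_avg(win integer) RETURNS SETOF float8 AS $$
DECLARE
    buf   float8[] := '{}';
    total float8   := 0;
    r     RECORD;
BEGIN
    FOR r IN SELECT value FROM measures ORDER BY ts LOOP
        buf   := array_append(buf, r.value::float8);  -- push the new value
        total := total + r.value;
        IF array_upper(buf, 1) > win THEN
            total := total - buf[1];                  -- pop and subtract the oldest
            buf   := buf[2:array_upper(buf, 1)];
        END IF;
        RETURN NEXT total / array_upper(buf, 1);
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

-- SELECT * FROM moving_avg(7);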


Hi,
I need to calculate a moving average and I would like to do it with SQL,
or a Pg function built for this purpose. I'm on Pg 7.4. Is this possible
in Pg without a bunch of self joins, or is there a function available?
Thanks,
Mike

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

