Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-10 Thread Pierre-Frdric Caillaud
On Mon, 10 Jan 2005 12:46:01 -0500, Alex Turner [EMAIL PROTECTED] wrote: You sir are correct! You can't use perl in MS-SQL or Oracle ;). Can you benefit from the luminous power of Visual Basic as a pl in MSSQL ? ---(end of broadcast)--- TIP 6:

Re: [PERFORM] PostgreSQL vs. Oracle vs. Microsoft

2005-01-10 Thread Pierre-Frdric Caillaud
The .NET Runtime will be a part of the next MS SQLServer engine. You will be able to have C# as a pl in the database engine with the next version of MSSQL. That certainly will be something to think about. Ah, well, if it's C# (or even VB.NET) then it's serious ! I thought postgres

Re: [PERFORM] Low Performance for big hospital server ..

2005-01-03 Thread Pierre-Frdric Caillaud
Decrease the sort mem too much [8196] make the performance much slower so I use sort_mem = 16384 and leave effective cache to the same value , the result is quite better but I should wait for tomorrow morning [official hour] to see the end result. You could also profile your queries to see

Re: [PERFORM] LIMIT causes SEQSCAN in subselect

2004-12-23 Thread Pierre-Frdric Caillaud
The fact that the estimator knows that the LIMIT is pointless because there are less rows in the subselect than the LIMIT will return is not something we want to count on; sometimes the estimator has innaccurate information. The UNIQUE index makes this more certain, except that I'm not

Re: [PERFORM] Using LIMIT changes index used by planner

2004-12-23 Thread Pierre-Frdric Caillaud
On Mon, 13 Dec 2004 17:43:07 -0500, Tom Lane [EMAIL PROTECTED] wrote: Sven Willenberger [EMAIL PROTECTED] writes: explain analyze select storelocation,order_number from custacct where referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07 12:00:00' order by custacctid limit 10;

Re: [PERFORM] Caching of Queries

2004-12-23 Thread Pierre-Frdric Caillaud
I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection). You don't use persistent connections ??? Your problem might simply be the connection time overhead (also including a

Re: [PERFORM] Poor Query

2004-12-06 Thread Pierre-Frdric Caillaud
How many rows do the following queries return : select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID and ba.accountnumber = '12345678' select userID from bankaccount ba where ba.bankaccountID = u.bankaccountID

Re: [PERFORM] Poor Query

2004-12-06 Thread Pierre-Frdric Caillaud
Just wanted to know the selectivity of the accountnumber and routingNumber columns. I shoulda written : How many rows do the following queries return : One or few at most, or a lot ? select userID from bankaccount WHERE accountnumber = '12345678' select userID

Re: [PERFORM] Poor Query

2004-12-06 Thread Pierre-Frdric Caillaud
Just One, user can i have only one bankaccount. Ah well, in that case : This is your query : select userID, fname, lname, email, phone, dateEntered, dateCanceled, dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches from Users u where 1=1 AND exists (select

Re: [PERFORM] Poor Query

2004-12-06 Thread Pierre-Frdric Caillaud
Your suffering comes from the where ba.bankaccountID = u.bankaccountID in the subselect. It means postgres has to run the subselect once for each row in Users. You want the subselect to run only once, and return one (or more?) bankaccountid's, then fetch the users from Users. Just

Re: [PERFORM] Data type to use for primary key

2004-11-23 Thread Pierre-Frdric Caillaud
All, Well, you should still escape any strings you're getting from a web page so you can ensure you're not subject to a SQL insert attack, even if you're expecting integers. Thanks, Peter Darley Well, your framework should do this for you : integer specified in your database object class

Re: [PERFORM] memcached and PostgreSQL

2004-11-22 Thread Pierre-Frdric Caillaud
While an exception, this is a very real possibility in day to day operations. The absence of any feedback or balancing mechanism between the database and cache makes it impossible to know that they are in sync and even a small error percentage multiplied over time will lead to an ever

Re: [PERFORM] Data type to use for primary key

2004-11-22 Thread Pierre-Frdric Caillaud
What is the common approach? Should I use directly the product_code as my ID, or use a sequantial number for speed? (I did the same for the company_id, this is a 'serial' and not the shor name of the customer. I just don't know what is usually done. Use a serial : - you can change product_code

Re: [PERFORM] scalability issues on win32

2004-11-22 Thread Pierre-Frdric Caillaud
Test platform: Pentium 4 3.06 GHz/HT 10k SATA Raptor 1Gb memory Windows XP Pro SP2/Redhat Fedora 3 (64 bit results coming soon) Could you please add information about... - filesystems ? - windows configured as network server or as desktop box ? - virtual memory In my experience you MUST

Re: [PERFORM] Index usage for sorted query

2004-11-20 Thread Pierre-Frdric Caillaud
Instead of : WHERE cd='ca' ORDER BY l_postcode; Write : WHERE cd='ca' ORDER BY cd, l_postcode; You have a multicolumn index, so you should specify a multicolumn sort exactly the same as your index, and the planner will get it. ---(end of

Re: [PERFORM] vacuum analyze slows sql query

2004-11-08 Thread Pierre-Frdric Caillaud
Lets say for a second that you manage to trick it into using index scan, and then you actually call the function with one of the values that returns 1,000s of rows. Probably it will take 10-100 times longer than if it used a seq scan. I don't know if it matters (I suspect that it does) but I am

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Pierre-Frdric Caillaud
Myself, I like a small Apache with few modules serving static files (no dynamic content, no db connections), and with a mod_proxy on a special path directed to another Apache which generates the dynamic pages (few processes, persistent connections...) You get the best of both, static files

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Pierre-Frdric Caillaud
On Thu, 4 Nov 2004 18:20:18 -, Matt Clark [EMAIL PROTECTED] wrote: Correct the 75% of all hits are on a script that can take anywhere from a few seconds to a half an hour to complete.The script essentially auto-flushes to the browser so they get new information as it arrives creating the

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Pierre-Frdric Caillaud
I'm guessing (2) - PG doesn't give the results of a query in a stream. In 1- I was thinking about a cursor... but I think his problem is more like 2- In that case one can either code a special purpose server or use the following hack : In your webpage include an iframe with a

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Pierre-Frdric Caillaud
check this marvelus piece of 5 minutes of work : http://boutiquenumerique.com/test/iframe_feed.html Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly, no need for even an Iframe. Ignore

Re: [PERFORM] preloading indexes

2004-11-03 Thread Pierre-Frdric Caillaud
-- uh, you can always load a table in cache by doing a seq scan on it... like select count(1) from table or something... this doesn't work for indexes of course, but you can always look in the system catalogs, find the filename for the index, then just open() it from an external program

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Pierre-Frdric Caillaud
Reiser4 ? On Thu, 21 Oct 2004 08:58:01 +0100, Matt Clark [EMAIL PROTECTED] wrote: I suppose I'm just idly wondering really. Clearly it's against PG philosophy to build an FS or direct IO management into PG, but now it's so relatively easy to plug filesystems into the main open-source Oses,

Re: [PERFORM] execute cursor fetch

2004-10-12 Thread Pierre-Frdric Caillaud
I just discovered this : http://www.postgresql.org/docs/7.4/static/jdbc-query.html#AEN24298 On Tue, 12 Oct 2004 04:43:43 -0700 (PDT), my ho [EMAIL PROTECTED] wrote: Hi, If anyone can help pls, I have a question abt the execution of cursor create/fetch/move , in particular about disk cost.

Re: [PERFORM] sequential scan on select distinct

2004-10-08 Thread Pierre-Frdric Caillaud
The really tricky part is that a DISTINCT ON needs to know about a first() aggregate. And to make optimal use of indexes, a last() aggregate as well. And ideally the planner/executor needs to know something is magic about first()/last() (and potentially min()/max() at some point) and that

Re: [PERFORM] integer[] indexing.

2004-10-08 Thread Pierre-Frdric Caillaud
disclaimer : brainless proposition (SELECT * FROM table WHERE (icount(ids) = 1 AND ids[1] = 33) UNION ALL (SELECT * FROM table WHERE (icount(ids) 1 AND ids '{33}')); ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster

Re: [PERFORM] sequential scan on select distinct

2004-10-08 Thread Pierre-Frdric Caillaud
Hashing is at least as fast, if not faster. regards, tom lane Probably quite faster if the dataset is not huge... UniqueSort would be useful for GROUP BY x ORDER BY x though ---(end of broadcast)--- TIP 3: if

Re: [PERFORM] sequential scan on select distinct

2004-10-07 Thread Pierre-Frdric Caillaud
I don't really think it would be a useful plan anyway. What *would* be useful is to support HashAggregate as an implementation alternative for DISTINCT --- currently I believe we only consider that for GROUP BY. The DISTINCT planning code is fairly old and crufty and hasn't been redesigned

Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Pierre-Frdric Caillaud
You could try : explain analyze select land from customer_dim group by land; It will be a lot faster but I can't make it use the index on my machine... Example : create table dummy as (select id, id%255 as number from a large table with 1M rows); so we have a table

Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Pierre-Frdric Caillaud
There are even three questions here : - given that 'SELECT DISTINCT field FROM table' is exactly the same as 'SELECT field FROM table GROUP BY field, postgres could transform the first into the second and avoid itself a (potentially killer) sort. On my example the table was not too

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
pgpool (which makes some rather questionable claims IMO); any decent web application language/environment will support connection pooling. That's why it should not be tied to something specific as pgpool. If you want performance, which is the case here, usually you have a webserver serving

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
1) The materialized data is available in 3 different forms; a list, a detail view, and a spreadsheet. Each form as somewhat different columns and different rules about ordering, which would likely confuse an SQC planner. In this implementation, all 3 forms are able to share the same cache.

Re: [PERFORM] Caching of Queries

2004-10-03 Thread Pierre-Frdric Caillaud
If it was in pgpool or something similar, I could devote a separate machine just for caching results leaving the db server untouched. BUT you would be limited to caching complete queries. There is a more efficient strategy... ---(end of

Re: [PERFORM] index not used when using function

2004-10-03 Thread Pierre-Frdric Caillaud
Maybe add an order by artist to force a groupaggregate ? Hi all, a small question: I've got this table songs and an index on column artist. Since there's about one distinct artist for every 10 rows, it would be nice if it could use this index when counting artists. It doesn't however:

Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread Pierre-Frdric Caillaud
Performance hint : For static data, do not normalize too much. For instance if you have a row which can be linked to several other rows, you can do this : create table parents ( id serial primary key, values... ) create table children ( id serial primary

Re: [PERFORM] Large # of rows in query extremely slow, not using

2004-09-14 Thread Pierre-Frdric Caillaud
I have a table with ~8 million rows and I am executing a query which should return about ~800,000 rows. The problem is that as soon as I execute the query it absolutely kills my machine and begins swapping for 5 or 6 minutes before it begins returning results. Is postgres trying to load the whole

Re: [PERFORM] Help with extracting large volumes of records across related tables

2004-09-13 Thread Pierre-Frdric Caillaud
There's a very simple solution using cursors. As an example : create table categories ( id serial primary key, name text ); create table items ( id serial primary key, cat_id integer references categories(id), name text ); create index items_cat_idx on items( cat_id ); insert

Re: [PERFORM] Help with extracting large volumes of records across related tables

2004-09-13 Thread Pierre-Frdric Caillaud
Thanks for the thanks ! Generally, when grouping stuff together, it is a good idea to have two sorted lists, and to scan them simultaneously. I have already used this solution several times outside of Postgres, and it worked very well (it was with Berkeley DB and there were 3 lists to

[PERFORM] Question on Byte Sizes

2004-09-10 Thread Pierre-Frdric Caillaud
Hello, * I need information on the size of pg ARRAY[]'s : I did not find any info in the Docs on this. How many bytes does an array take on disk ? Is there a difference between an array of fixed size elements like integers, and an array of variable length elements like text ? is there a

Re: [PERFORM] The usual sequential scan, but with LIMIT !

2004-09-07 Thread Pierre-Frdric Caillaud
OK, thanks a lot for your explanations. Knowing how the planner thinks, makes it pretty logical. Thank you. Now another question... I have a table of records representing forum posts with a primary key (id), a topic_id, a timestamp, and other fields which I won't detail. I want to

Re: [PERFORM] The usual sequential scan, but with LIMIT !

2004-09-07 Thread Pierre-Frdric Caillaud
Yes, you're right as usual. I had not thought about playing with ORDER BY on a field which has only one value in the result set. If you write it as SELECT WHERE topic_id=2 ORDER BY topic_id DESC,id DESC. then an index on (topic_id, id) will work fine. The mixed ASC/DESC ordering is

[PERFORM] The usual sequential scan, but with LIMIT !

2004-09-06 Thread Pierre-Frdric Caillaud
Hello, I have this table : CREATE TABLE apparts ( id SERIAL NOT NULL PRIMARY KEY, priceFLOAT NOT NULL, surfaceINTEGER NOT NULL, price_sq FLOAT NOT NULL, roomsINTEGER NULL, vente

Re: [PERFORM] fsync vs open_sync

2004-09-05 Thread Pierre-Frdric Caillaud
Were you upset by my message ? I'll try to clarify. I understood from your email that you are a Windows haters Well, no, not really. I use Windows everyday and it has its strengths. I still don't think the average (non-geek) person can really use Linux as a Desktop OS. The problem I have

Re: [PERFORM] fsync vs open_sync

2004-09-03 Thread Pierre-Frdric Caillaud
There is also the fact that NTFS is a very slow filesystem, and Linux is a lot better than Windows for everything disk, caching and IO related. Try to copy some files in NTFS and in ReiserFS... I'm not so sure I would agree with such a blanket generalization. I find NTFS to be very fast,

Re: [PERFORM] Why does a simple query not use an obvious index?

2004-08-30 Thread Pierre-Frdric Caillaud
Another primary key trick : If you insert records with a serial primary key, and rarely delete them or update the timestamp, you can use the primary key to compute an approximate number of rows. a := SELECT pkey FROM table WHERE timestamp() threshold ORDER BY timestamp ASC LIMIT 1;

Re: [PERFORM] seqscan instead of index scan

2004-08-30 Thread Pierre-Frdric Caillaud
create index t_idx on t((c+d)); select * from t where c+d 0; Why not : select ((select * from t where c0::bigint) UNION (select * from t where d0::bigint)) group by whatever; or someting ? ---(end of broadcast)--- TIP 3: if

Re: [PERFORM] Postgres does not utilize indexes. Why?

2004-08-17 Thread Pierre-Frdric Caillaud
test where id = 5; Few times I added 100,000 records, applied cast the 5 to int8 and it will use the index ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes

[PERFORM] Reiser4

2004-08-13 Thread Pierre-Frdric Caillaud
ReiserFS 4 is (will be) a filesystem that implements transactions. Are there any plans in a future Postgresql version to support a special fsync method for Reiser4 which will use the filesystem's transaction engine, instead of an old kludge like fsync(), with a possibility of

Re: [PERFORM] Hardware upgrade for a high-traffic database

2004-08-11 Thread Pierre-Frdric Caillaud
We're currently running Postgres 7.4.1 on FreeBSD 5.2, a Dual Xeon 2.4, 2GB ECC, 3Ware Serial ATA RAID 5 w/ 4 disks (SLOW!!). Cheap solution while you look for another server : Try to use something other than RAID5. You have 4 disks, so you could use a striping+mirroring RAID which would

Re: [PERFORM] NUMERIC x VARCHAR

2004-08-11 Thread Pierre-Frdric Caillaud
Numeric won't store that : (+33) 4 01 23 45 67 On Wed, 11 Aug 2004 02:42:33 -0300, Er Galvão Abbott [EMAIL PROTECTED] wrote: Greetings. I have a question regarding performance of certain datatypes: I have a field where I will store my clients phone numbers. I know that this field will

Re: [PERFORM] Performance Bottleneck

2004-08-05 Thread Pierre-Frdric Caillaud
Apache processes running for 30 minutes ?. My advice : use frames and Javascript ! In your webpage, you have two frames : content and refresh. content starts empty (say, just a title on top of the page). refresh is refreshed every five seconds from a script on your

Re: [PERFORM] Performance Bottleneck

2004-08-04 Thread Pierre-Frdric Caillaud
The queries themselves are simple, normally drawing information from one table with few conditions or in the most complex cases using joins on two table or sub queries. These behave very well and always have, the problem is that these queries take place in rather large amounts due to the dumb

Re: [PERFORM] Tuning queries on large database

2004-08-04 Thread Pierre-Frdric Caillaud
not so bad for oracle. What about for PG ? How data is stored I agree with the datatype issue. Smallint, bigint, integer... add a constraint... Also the way order of the records in the database is very important. As you seem to have a very large static population in your table, you should

Re: [PERFORM] Tuning queries on large database

2004-08-04 Thread Pierre-Frdric Caillaud
You often make sums. Why not use separate tables to cache these sums by month, by poste, by whatever ? Rule on insert on the big table updates the cache tables. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose

Re: [PERFORM] arrays and indexes

2004-07-26 Thread Pierre-Frdric Caillaud
SELECT * from content where 42 = ANY (authors); Postgres does have a way to do what you ask, though. It involves GiST indexes and the operators from the contrib/intarray directory from the Postgres source. I have tried to use these indexes, and the performance was very good. It can be