Re: [PERFORM] How are text columns stored?

2005-06-28 Thread Tom Lane
Meetesh Karia [EMAIL PROTECTED] writes: According to section 8.3 of the doc: Long values are also stored in background tables so they do not interfere with rapid access to the shorter column values. So, how long does a value have to be to be considered long? Several kilobytes.
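
A minimal sketch of what that looks like in practice (table and column names here are made up, not from the thread): values larger than roughly a couple of kilobytes are compressed and/or moved to the out-of-line TOAST table automatically, and the per-column strategy can be overridden if needed.

    -- hypothetical table; long text values are TOASTed automatically
    CREATE TABLE notes (id integer PRIMARY KEY, body text);
    -- optionally force out-of-line, uncompressed storage for the column
    ALTER TABLE notes ALTER COLUMN body SET STORAGE EXTERNAL;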

Re: [PERFORM] perl garbage collector

2005-06-28 Thread Tom Lane
Jean-Max Reymond [EMAIL PROTECTED] writes: I have a stored procedure written in perl and I doubt that perl's garbage collector is working :-( after a lot of work, postmaster has a size of 1100 Mb and I think that the keyword undef has no effects. Check the PG list archives --- there's been

Re: [PERFORM] How can I speed up this function?

2005-06-28 Thread David Mitchell
The function I have exits the loop when the count hits 100 yes, but the inner loop can push the count up as high as necessary to select all the statements for a transaction, so by the time it exits, the count could be much higher. I do want to limit the statements, but I want to get enough for

Re: [PERFORM] How can I speed up this function?

2005-06-28 Thread Gnanavel Shanmugam
But in the function you are exiting the loop when the count hits 100. If you do not want to limit the statements then remove the limit clause from the query I've written. with regards, S.Gnanavel -Original Message- From: [EMAIL PROTECTED] Sent: Tue, 28 Jun 2005 16:29:32 +1200 To:

Re: [PERFORM] Too slow querying a table of 15 million records

2005-06-28 Thread John A Meinel
Tobias Brox wrote: [EMAIL PROTECTED] - Tue at 08:33:58PM +0200] I use FreeBSD 4.11 with PostGreSQL 7.3.8. (...) database= explain select date_trunc('hour', time),count(*) as total from test where p1=53 and time > now() - interval '24 hours' group by date_trunc order by date_trunc ;

Re: [PERFORM] Postgresql7.4.5 running slow on plpgsql function

2005-06-28 Thread Michael Fuhr
On Thu, Jun 23, 2005 at 05:56:52PM +0800, Chun Yit(Chronos) wrote: currently we have a function that is used together with a temp table; it calls a search result function, and every time this function is called it goes through some filters before coming out as a result. Now we have some major problem,

Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Praveen Raja
I assume you took size to mean the row size? What I really meant was does the number of rows a table has affect the performance of new inserts into the table (just INSERTs) all other things remaining constant. Sorry for the confusion. I know that having indexes on the table adds an overhead but

Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Jacques Caron
Hi, At 11:50 28/06/2005, Praveen Raja wrote: I assume you took size to mean the row size? Nope, the size of the table. What I really meant was does the number of rows a table has affect the performance of new inserts into the table (just INSERTs) all other things remaining constant. Sorry

Re: [PERFORM] Too slow querying a table of 15 million records

2005-06-28 Thread Christopher Kings-Lynne
database= explain select date_trunc('hour', time),count(*) as total from test where p1=53 and time > now() - interval '24 hours' group by date_trunc order by date_trunc ; Try going: time > '2005-06-28 15:34:00' ie. put in the time 24 hours ago as a literal constant. Chris
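
A hedged sketch of that suggestion, reusing the table and column names from the thread (the literal is just an example of "24 hours ago" written out):

    explain select date_trunc('hour', time), count(*) as total
    from test
    where p1 = 53 and time > '2005-06-28 15:34:00'
    group by date_trunc
    order by date_trunc;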

[PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. In other words, if an 'id' column has values 1,2,3,4,6 and 7, I need a query that returns the value of 5. I've already worked out a
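
For reference, a hedged sketch of what the generate_series approach mentioned above might look like (my_table is a made-up name); it enumerates every candidate value up to the current maximum, which is why it does not scale:

    SELECT s.i
      FROM generate_series(1, (SELECT max(id) + 1 FROM my_table)) AS s(i)
     WHERE s.i NOT IN (SELECT id FROM my_table)
     ORDER BY s.i
     LIMIT 1;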

Re: [PERFORM] tricky query

2005-06-28 Thread Bruno Wolff III
On Tue, Jun 28, 2005 at 10:21:16 -0400, Merlin Moncure [EMAIL PROTECTED] wrote: I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. In other words, if an 'id' column has values

Re: [PERFORM] tricky query

2005-06-28 Thread John A Meinel
Merlin Moncure wrote: I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. In other words, if an 'id' column has values 1,2,3,4,6 and 7, I need a query that returns the value of 5.

Re: [PERFORM] tricky query

2005-06-28 Thread bsimon
I would suggest something like this, don't know how fast it is ... : SELECT (ID+1) as result FROM my_table WHERE (ID+1) NOT IN (SELECT ID FROM my_table) ORDER BY result asc limit 1; Merlin Moncure [EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 28/06/2005 16:21 To:

Re: [PERFORM] Insert performance vs Table size

2005-06-28 Thread Tom Lane
Praveen Raja [EMAIL PROTECTED] writes: I know that having indexes on the table adds an overhead but again does this overhead increase (for an INSERT operation) with the number of rows the table contains? Typical index implementations (such as b-tree) have roughly O(log N) cost to insert or
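
A rough, hypothetical way to observe this with psql (table and index names are made up): load equal-sized batches into an indexed table and compare timings; with b-tree indexes the per-row insert cost should grow only roughly logarithmically as the row count climbs.

    CREATE TABLE insert_test (id serial PRIMARY KEY, payload text);
    CREATE INDEX insert_test_payload_idx ON insert_test (payload);
    \timing
    INSERT INTO insert_test (payload)
        SELECT 'row ' || g::text FROM generate_series(1, 100000) AS g;
    -- repeat the INSERT several times and compare the reported timings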

Re: [PERFORM] tricky query

2005-06-28 Thread John A Meinel
Merlin Moncure wrote: Not so bad. Try something like this: SELECT min(id+1) as id_new FROM table WHERE (id+1) NOT IN (SELECT id FROM table); Now, this requires probably a sequential scan, but I'm not sure how you can get around that. Maybe if you got trickier and did some ordering

Re: [PERFORM] tricky query

2005-06-28 Thread Sam Mason
Merlin Moncure wrote: I've already worked out a query using generate_series (not scalable) and pl/pgsql. An SQL only solution would be preferred, am I missing something obvious? I would be tempted to join the table to itself like: SELECT id+1 FROM foo WHERE id > 0 AND id+1 NOT IN (SELECT

Re: [PERFORM] tricky query

2005-06-28 Thread John A Meinel
John A Meinel wrote: Merlin Moncure wrote: I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. In other words, if an 'id' column has values 1,2,3,4,6 and 7, I need a query that

Re: [PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
Not so bad. Try something like this: SELECT min(id+1) as id_new FROM table WHERE (id+1) NOT IN (SELECT id FROM table); Now, this requires probably a sequential scan, but I'm not sure how you can get around that. Maybe if you got trickier and did some ordering and limits. The above

Re: [PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
John Meinel wrote: See my follow up post, which enables an index scan. On my system with 90k rows, it takes no apparent time. (0.000ms) John =:- Confirmed. Hats off to you, the above is some really wicked querying. IIRC I posted the same question several months ago with no response and had

Re: [PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
Merlin Moncure wrote: I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. I've already worked out a query using generate_series (not scalable) and pl/pgsql. An SQL only

[PERFORM] read block size

2005-06-28 Thread Michael Stone
Is it possible to tweak the size of a block that postgres tries to read when doing a sequential scan? It looks like it reads in fairly small blocks, and I'd expect a fairly significant boost in i/o performance when doing a large (multi-gig) sequential scan if larger blocks were used. Mike Stone

Re: [PERFORM] tricky query

2005-06-28 Thread Cosimo Streppone
Merlin Moncure wrote: I need a fast way (sql only preferred) to solve the following problem: I need the smallest integer that is greater than zero that is not in the column of a table. I've already worked out a query using generate_series (not scalable) and pl/pgsql. An SQL only solution

Re: [PERFORM] Too slow querying a table of 15 million records

2005-06-28 Thread PFC
database= explain select date_trunc('hour', time),count(*) as total from test where p1=53 and time > now() - interval '24 hours' group by date_trunc order by date_trunc ; 1. Use CURRENT_TIMESTAMP (which is considered a constant by the planner) instead of now() 2. Create a multicolumn
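
A hedged guess at the multicolumn index being suggested, using the thread's table and column names (whether the planner actually uses it for the range condition depends on the PostgreSQL version and statistics):

    CREATE INDEX test_p1_time_idx ON test (p1, time);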

Re: [PERFORM] read block size

2005-06-28 Thread Michael Stone
On Tue, Jun 28, 2005 at 12:02:55PM -0500, John A Meinel wrote: There has been discussion about changing the reading/writing code to be able to handle multiple pages at once, (using something like vread()) but I don't know that it has been implemented. That sounds promising. Also, this would

Re: [PERFORM] tricky query

2005-06-28 Thread John A Meinel
Merlin Moncure wrote: John Meinel wrote: See my follow up post, which enables an index scan. On my system with 90k rows, it takes no apparent time. (0.000ms) John =:- Confirmed. Hats off to you, the above is some really wicked querying. IIRC I posted the same question several months ago

Re: [PERFORM] Too slow querying a table of 15 million records

2005-06-28 Thread Tom Lane
PFC [EMAIL PROTECTED] writes: 1. Use CURRENT_TIMESTAMP (which is considered a constant by the planner) instead of now() Oh? regards, tom lane

Re: [PERFORM] tricky query

2005-06-28 Thread Sam Mason
John A Meinel wrote: SELECT t1.id+1 as id_new FROM id_test t1 WHERE NOT EXISTS (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1) ORDER BY t1.id LIMIT 1; This works well on sparse data, as it only requires as many index accesses as it takes to find the first gap. The simpler NOT IN
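
For readability, the index-friendly query quoted above, reformatted (id_test is the thread's example table):

    SELECT t1.id + 1 AS id_new
      FROM id_test t1
     WHERE NOT EXISTS (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id + 1)
     ORDER BY t1.id
     LIMIT 1;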

Re: [PERFORM] perl garbage collector

2005-06-28 Thread Jean-Max Reymond
2005/6/28, Tom Lane [EMAIL PROTECTED]: Jean-Max Reymond [EMAIL PROTECTED] writes: I have a stored procedure written in perl and I doubt that perl's garbage collector is working :-( after a lot of work, postmaster has a size of 1100 Mb and I think that the keyword undef has no effects.

Re: [PERFORM] tricky query

2005-06-28 Thread John A Meinel
Merlin Moncure wrote: On Tue, Jun 28, 2005 at 12:02:09 -0400, Merlin Moncure [EMAIL PROTECTED] wrote: Confirmed. Hats off to you, the above is some really wicked querying. IIRC I posted the same question several months ago with no response and had given up on it. I think your solution

Re: [PERFORM] tricky query

2005-06-28 Thread Cosimo Streppone
John A Meinel wrote: John A Meinel wrote: Merlin Moncure wrote: I need the smallest integer that is greater than zero that is not in the column of a table. In other words, if an 'id' column has values 1,2,3,4,6 and 7, I need a query that returns the value of 5. [...] Well, I was able

Re: [PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
On Tue, Jun 28, 2005 at 12:02:09 -0400, Merlin Moncure [EMAIL PROTECTED] wrote: Confirmed. Hats off to you, the above is some really wicked querying. IIRC I posted the same question several months ago with no response and had given up on it. I think your solution (smallest X+1 not in X)

Re: [PERFORM] tricky query

2005-06-28 Thread Bruno Wolff III
On Tue, Jun 28, 2005 at 12:02:09 -0400, Merlin Moncure [EMAIL PROTECTED] wrote: Confirmed. Hats off to you, the above is some really wicked querying. IIRC I posted the same question several months ago with no response and had given up on it. I think your solution (smallest X+1 not in X) is a

Re: [PERFORM] tricky query

2005-06-28 Thread Merlin Moncure
Cosimo wrote: I'm very interested in this tricky query. Sorry John, but if I populate the `id_test' relation with only 4 tuples with id values (10, 11, 12, 13), the result of this query is: cosimo= create table id_test (id integer primary key); NOTICE: CREATE TABLE / PRIMARY KEY
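
One hedged way to also cover Cosimo's case (ids 10, 11, 12, 13 should yield 1, not 14) is to check for a missing id = 1 first and fall back to the gap query; this is an editorial sketch, not something posted in the thread:

    SELECT COALESCE(
             (SELECT 1 WHERE NOT EXISTS (SELECT 1 FROM id_test WHERE id = 1)),
             (SELECT t1.id + 1
                FROM id_test t1
               WHERE NOT EXISTS (SELECT 1 FROM id_test t2 WHERE t2.id = t1.id + 1)
               ORDER BY t1.id
               LIMIT 1)
           ) AS id_new;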

[PERFORM] optimized counting of web statistics

2005-06-28 Thread Billy extyeightysix
Hola folks, I have a web statistics Pg database (user agent, urls, referrer, etc) that is part of an online web survey system. All of the data derived from analyzing web server logs is stored in one large table with each record representing an analyzed webserver log entry. Currently all reports

Re: [PERFORM] tricky query

2005-06-28 Thread Sebastian Hennebrueder
John A Meinel schrieb: John A Meinel wrote: Well, I was able to improve it to using appropriate index scans. Here is the query: SELECT t1.id+1 as id_new FROM id_test t1 WHERE NOT EXISTS (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1) ORDER BY t1.id LIMIT 1; I

Re: [PERFORM] optimized counting of web statistics

2005-06-28 Thread Billy extyeightysix
The bottleneck in the whole process is actually counting each data point (how many times a url was visited, or how many times a url referred the user to the website). So more specifically I am wondering if there is way to store and retrieve the data such that it speeds up the counting of the
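
One common approach, sketched here with made-up table and column names, is to pre-aggregate the counts into a summary table during the batch run so reports read small, indexed totals instead of recounting the raw log table:

    CREATE TABLE url_hits_daily (
        day   date   NOT NULL,
        url   text   NOT NULL,
        hits  bigint NOT NULL,
        PRIMARY KEY (day, url)
    );

    INSERT INTO url_hits_daily (day, url, hits)
        SELECT date_trunc('day', log_time)::date, url, count(*)
          FROM weblog_entries
         GROUP BY 1, 2;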

Re: [PERFORM] optimized counting of web statistics

2005-06-28 Thread Matthew Nuzum
On 6/28/05, Billy extyeightysix [EMAIL PROTECTED] wrote: Hola folks, I have a web statistics Pg database (user agent, urls, referrer, etc) that is part of an online web survey system. All of the data derived from analyzing web server logs is stored in one large table with each record

Re: [PERFORM] perl garbage collector

2005-06-28 Thread Jean-Max Reymond
2005/6/28, Jean-Max Reymond [EMAIL PROTECTED]: For my application (in real life), after millions of spi_exec_query calls, it grows up to 1Gb :-( OK, now in 2 lines: CREATE FUNCTION jmax() RETURNS integer AS $_$ use strict; for (my $i=0; $i<1000; $i++) { spi_exec_query("select 'foo'"); } my

Re: [PERFORM] optimized counting of web statistics

2005-06-28 Thread Rudi Starcevic
Hi, I do my batch processing daily using a python script I've written. I found that trying to do it with pl/pgsql took more than 24 hours to process 24 hours worth of logs. I then used C# and in memory hash tables to drop the time to 2 hours, but I couldn't get mono installed on some of my older

Re: [PERFORM] optimized counting of web statistics

2005-06-28 Thread Matthew Nuzum
On 6/29/05, Rudi Starcevic [EMAIL PROTECTED] wrote: Hi, I do my batch processing daily using a python script I've written. I found that trying to do it with pl/pgsql took more than 24 hours to process 24 hours worth of logs. I then used C# and in memory hash tables to drop the time to 2