[GENERAL] Interval ordering
Hello, I have a table with a list of times. When the user provides my application with a desired time, I want to show them the 5 times from the table that are closest to their input. I expected to do this using abs(), like so:

select mytime from mytable order by abs(usertime-mytime) asc limit 5;

However, the difference between two times is an interval, and there appears to be no absolute value operator for intervals. My next thought was to convert the interval to an integer, like so:

select mytime from mytable order by abs((usertime-mytime) / interval '1 minute') asc limit 5;

However, there is no operator for interval division either. The best solution I've come up with is to use a CASE expression:

select mytime from mytable order by case when (usertime-mytime) < interval '0' then (mytime-usertime) else (usertime-mytime) end asc limit 5;

Is this ugly query really necessary in Postgres?
Re: [GENERAL] Interval ordering
How about something like: test(5432)aklaver=> SELECT ts_fld2,now()-ts_fld2 from timestamp_test order by now()-ts_fld2 limit 5; Thanks Adrian, Let me explain the problem better. Say my table has 24 entries, one for each hour, midnight through 11 PM. If the user enters 6:30 PM, I want to give them the closest times in proximity - both before AND after - to their input... so in this case, I'd return 5:00 PM, 6:00 PM, 7:00 PM, 8:00 PM, etc. I believe your solution only provides the closest times BEFORE and not AFTER. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Interval ordering
try this: select mytime from mytable order by abs(extract(epoch from (usertime-mytime))) asc limit 5; SELECT ts_fld2,abs(extract(epoch from '2011-03-25 14:15:25-07'::timestamptz)-extract(epoch from ts_fld2)) from timestamp_test order by abs(extract(epoch from '2011-03-25 14:15:25-07'::timestamptz)-extract(epoch from ts_fld2)) limit 5; Thanks to Ondrej and Adrian, who both provided working solutions using extract/epoch.
[GENERAL] calling a function over several rows
Hello, There is an existing function which takes an integer and returns a record. I need to call this function with every integer in a table. Is there a simple shortcut for doing this? I'm looking for something like: select f.* from function(t.value) f, table t Thanks, Adam
Re: [GENERAL] calling a function over several rows
Merlin Moncure wrote: On Tue, Nov 17, 2009 at 1:02 AM, Adam Rich ada...@sbcglobal.net wrote: Hello, There is an existing function which takes an integer and returns a record. I need to call this function with every integer in a table. Is there a simple shortcut for doing this? I'm looking for something like: select f.* from function(t.value) f, table t select (f).* from (select function(t.value) as f from table t) q; merlin Thanks, that's perfect, and much faster than the one I came up with in the interim: select (f(t.value)).* from table t;
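For readers on newer versions: PostgreSQL 9.3 added LATERAL, which expresses this kind of per-row function call directly. A sketch, where myfunction and mytable stand in for the function and table names from the question:

```sql
-- LATERAL lets a function in FROM reference columns of a preceding FROM item,
-- calling it once per row of t.
SELECT f.*
FROM mytable t,
     LATERAL myfunction(t.value) AS f;
```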
[GENERAL] Preventing database listing?
This seems like a simple question that would have come up, but I'm not able to find an answer in google, PG docs, or PG mailing list archives. How do I prevent a user from being able to list all databases in my cluster? I want to restrict them to seeing just the databases they have connect rights to. Thanks Adam
[GENERAL] Understanding sort's memory/disk usage
Hello, Please reference these explain plans. This is Pg 8.4.1 http://explain-analyze.info/query_plans/4032-query-plan-2745 http://explain-analyze.info/query_plans/4033-query-plan-2746 First, could somebody explain what is leading the first query to choose a different plan that's much slower? In the first plan only, this expression is in the select and group by: s.store_num || ' - ' || s.title These are both non-null varchar fields. Both have a unique index. Second, why would it choose to sort on disk for what appears to be ~32 MB of data, when my work_mem and temp_buffers are both 64 MB each? If I increase work_mem and temp_buffers to 128 MB, I get a faster plan: http://explain-analyze.info/query_plans/4034-query-plan-2747 But it's only reporting 92 kB of memory used? Why don't I see numbers between 64 MB and 128 MB for both the on-disk and in-memory plans? Thanks, Adam
Re: [GENERAL] More straight forward method to convert seconds::bigint to interval
Shane R. Spencer wrote: I work in VoIP. HMS (Hour/Minute/Second) format appears to be the rule when working with call time totals. I admit it makes some reports easier to read. The method I used to convert an int/bigint to HMS (or the standard representation of an interval type) is as follows: select (123456.789::varchar(24) || ' seconds')::interval as HMS; hms -- 34:17:36.789 Is there a less string oriented method of converting seconds as an int to an interval? - Shane I think this is cleaner/faster: select interval '1 second' * 123456.789 as HMS; hms -- 34:17:36.789 -Adam
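For what it's worth, PostgreSQL 9.4 and later also offer make_interval(), which avoids both the string round-trip and the multiplication; a sketch:

```sql
-- seconds-to-interval via the dedicated constructor (9.4+)
SELECT make_interval(secs => 123456.789) AS hms;
```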
Re: [GENERAL] Multiple counts on criteria - Approach to a problem
Neil Saunders wrote: Hi all, I maintain an online property rental application. The main focus of the UI is the search engine, which I'd now like to improve by allowing filtering of the search results shown on some criteria, but provide a count of the number of properties that meet that criteria. (snip) ...and so on. My question is simple - What's the best way to implement this - Do I literally have to execute a count for the WHERE criteria with the filter criteria tagged on, or is there some clever trick that I'm not aware of? I'd rather not count in the application as I'd like to plan for the day we have up to 100k properties. Any suggestions gratefully received! Here's the structure you want: select sum(case bedrooms when 1 then 1 else 0 end) as br1, sum(case bedrooms when 2 then 1 else 0 end) as br2, sum(case bedrooms when 3 then 1 else 0 end) as br3, sum(case has_bbq when 1 then 1 else 0 end) as bbq, sum(case has_pool when 1 then 1 else 0 end) as pool from properties in other words, you can put the criteria inside a CASE expression that returns a 0 or 1, and use sum() over that CASE to count the rows that returned a 1. Adam
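As an aside for readers on PostgreSQL 9.4 or later: the aggregate FILTER clause expresses the same counts more directly. A sketch against the same hypothetical properties table:

```sql
-- each count only sees rows matching its FILTER predicate
SELECT count(*) FILTER (WHERE bedrooms = 1) AS br1,
       count(*) FILTER (WHERE bedrooms = 2) AS br2,
       count(*) FILTER (WHERE bedrooms = 3) AS br3,
       count(*) FILTER (WHERE has_bbq = 1)  AS bbq,
       count(*) FILTER (WHERE has_pool = 1) AS pool
FROM properties;
```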
Re: [GENERAL] Audit Trigger puzzler
David Kerr wrote: On Wed, Sep 02, 2009 at 11:44:20PM -0500, Adam Rich wrote: - In Oracle, the way we handle audit triggers is by using Package - Variables. We emulate some of that functionality in postgresql by - adding a custom variable to the configuration file: - - custom_variable_classes = 'mysess' - - - In your trigger, you could check that this variable was unset, and fall - back to the database user. - Thanks! that does seem slick, but will it work with connection pooling? Dave I don't see why it wouldn't work, as long as you set reset_query_list properly, and set the session variable to the logged-in user whenever you grab a connection from the pool.
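One way to sketch the pool-safe flow ('dave' is a placeholder application user): stamp the session when a connection is checked out, and clear session state when it is returned. DISCARD ALL (available since 8.3) resets custom settings, among other things:

```sql
-- on checkout from the pool: stamp the session with the application user
SELECT set_config('mysess.curr_user', 'dave', false);

-- on check-in to the pool: reset all session state, including the variable
DISCARD ALL;
```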
Re: [GENERAL] auto-increment in a view
Willy-Bas Loos wrote: Hi, I'm trying to figure out how to generate an auto-increment column in a view. There is no physical column to base it on; the view contains a group by clause, which renders that impossible. In a normal query I can create a sequence for that purpose and drop it afterwards, but apart from it being ugly, it's impossible in a view. Another possibility is to create a function and call that function from the view. It works, but the function is not transparent like the view is. Meaning: the function will execute the whole query, gather the results, and when I only need a subset, it will just forget about the surplus. Isn't there a decent way to add an incrementing value to a view? Cheers, WBL Sounds like you need ROWNUM, which is easy to do with windowing functions in 8.4, but on 8.3 you'll need a hack like this: http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/ Then you could define your view as: create or replace view testview as select rownum(), value from test group by value;
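On 8.4 and later the windowing version looks like this (a sketch against the same hypothetical test table; window functions run after GROUP BY, so the grouping is no obstacle):

```sql
CREATE OR REPLACE VIEW testview AS
SELECT row_number() OVER (ORDER BY value) AS rownum,
       value
FROM test
GROUP BY value;
```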
[GENERAL] New server disk setup?
Hello, I'm building a new server on RHEL 5.3 and was wondering if there was an optimized build guide published somewhere with guidelines on disk partitioning, filesystems, etc.? For example, do you recommend putting the data on an ext2 partition mounted noatime, and the logs on ext3? Or should I just use XFS for the whole thing? Are there any other choices to consider now for a brand-new server? I did find some discussions in the mailing archives around filesystems, but there was mostly conflicting information. And I realize that a lot of the drawbacks that were considerations in the past may have been resolved by now. It would be great if one could view the developer-recommended configuration for a particular operating system on the postgresql.org website. I found a few guides on the wiki, but they are either 3+ years old, or they don't focus on server setup as a whole. Thanks, Adam
Re: [GENERAL] Audit Trigger puzzler
Most of the time, my application will set the edited_by field to reflect an application username (i.e., the application logs into the database as a database user, and that's not going to be the application user). So I log into my application as Dave, but the application connects to the database as dbuser. If the app doesn't specifically send an edited_by value in its update, then I want to default that value to the database user. This would also be good for auditing any manual data changes that could happen at the psql level. In Oracle, the way we handle audit triggers is by using Package Variables. We emulate some of that functionality in postgresql by adding a custom variable to the configuration file:

custom_variable_classes = 'mysess'

Then, whenever a user logs into the application, my login procedure calls this function:

CREATE OR REPLACE FUNCTION begin_sess(staffid character varying) RETURNS void AS $BODY$
BEGIN
  PERFORM set_config('mysess.curr_user', coalesce(staffid,''), false);
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

This makes the current application user automatically available to every function, including triggers. Then, in your triggers, you can do this:

DECLARE
  curr_user staff.staff_id%TYPE;
BEGIN
  SELECT current_setting('mysess.curr_user') INTO curr_user;

In your trigger, you could check that this variable was unset, and fall back to the database user. HTH.
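To make the fallback concrete, here is a sketch of what such a trigger body might look like (audit_stamp is a hypothetical name; note current_setting() raises an error for a variable that was never set in the session, so it should be initialized, e.g. in postgresql.conf):

```sql
CREATE OR REPLACE FUNCTION audit_stamp() RETURNS trigger AS $BODY$
BEGIN
  -- use the application user if begin_sess() set one, else the db user
  NEW.edited_by := coalesce(nullif(current_setting('mysess.curr_user'), ''),
                            session_user);
  RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;
```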
Re: R: [GENERAL] Field's position in Table
From: vinny vi...@xs4all.nl Subject: Re: R: [GENERAL] Field's position in Table To: Sam Mason s...@samason.me.uk Cc: pgsql-general@postgresql.org Date: Monday, August 24, 2009, 2:38 PM On Thu, 2009-08-20 at 12:16 +0100, Sam Mason wrote: On Thu, Aug 20, 2009 at 11:24:49AM +0200, vinny wrote: I can't really think of any real reason to put the field at a particular position, applications don't really care about the order of fields. Because it's very convenient for ad-hoc queries! PG currently assumes that the column order is the same as when it was created but there are (unimplemented) suggestions about how to fix this. See for example: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php -- Sam http://samason.me.uk/ But how is it convenient exactly, is it just a timesaver so you can SELECT * instead of having to type SELECT firstname, lastname, email? For me, saying all new fields must go at the end of the table is like saying all new functions must go at the end of your C source file. Not that it makes *any* difference to the end user, or other applications using your libraries, but as developers we tend to be more organized than the general public. Most programmers habitually organize their source code to keep related functions together. It seems sloppy to have 10 memory-related functions together in the source, and then an 11th hidden 6 pages down in the middle of file-related functions. And if you're writing OO code in C++ or Java, you even group private variables and methods separately from public ones. Most of the people who advocate tacking new fields at the end of a table would never dream of following this convention for source code. So when I'm working in PgAdmin, I like to see my primary/foreign keys listed first, then data fields in logical groupings, and finally the standard footer fields we add to all tables like create/update by/date.
Whenever I'm developing and need to reference a table definition, (or do a select * in pgAdmin for sample data) I lose productivity having to scan through all the fields repeatedly instead of seeing at a glance the fields I want because I know where they *should* be in the listing. Sometimes I have to scan through the fields several times before I finally see the one I want, because it was in the middle of unrelated items. I *never* code my applications to depend on field order; I'm referring to development convenience only. (Just my two cents, YMMV, etc)
Re: [GENERAL] how to return field based on field= NULL or not
Juan Backson wrote: Hi, In my table, I have fieldA and fieldB. At any point in time, only one of these fields can have data in it. The other is NULL. Instead of select fieldA, fieldB from table, I want it to return either fieldA or fieldB, depending on which one is not NULL. The reason is because I want to use select array_to_string(array_accum(fieldA or fieldB),',') from table. Is it possible to do it that way? Thanks, JB The two main ways of doing this are COALESCE(fieldA, fieldB) http://www.postgresql.org/docs/8.3/interactive/functions-conditional.html#AEN14484 and CASE WHEN fieldA IS NULL THEN fieldB ELSE fieldA END; http://www.postgresql.org/docs/8.3/interactive/functions-conditional.html#AEN14434
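Putting it together with the poster's aggregation goal (a sketch; mytable is hypothetical, and array_agg is the built-in equivalent of the custom array_accum aggregate on 8.4+):

```sql
-- collapse whichever of the two columns is non-NULL, then join with commas
SELECT array_to_string(array_agg(coalesce(fieldA, fieldB)), ',')
FROM mytable;
```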
Re: [GENERAL] Postgre RAISE NOTICE and PHP
Andre, See this PHP page: http://www.php.net/manual/en/function.pg-last-notice.php Andre Lopes wrote: Hi, I'm developing a function with some checks, for example... to check if the e-mail is valid or not. If the e-mail is not valid I put a line with RAISE NOTICE 'E-mail not valid'. I need to know if it is possible to show this RAISE NOTICE when I run this function from PHP. Best Regards, André.
Re: [GENERAL] Overhead of union versus union all
Tim Keitt wrote: I am combining query results that I know are disjoint. I'm wondering how much overhead there is in calling union versus union all. (Just curious really; I can't see a reason not to use union all.) (cc me please; not subscribed...) THK I think you can test this one yourself pretty easily. Just run the two queries with explain analyze. Union All should run in about the sum of the times of the two separate queries. Plain Union will always be slower, because it takes the same results from union all and runs them through an extra sort/distinct or hash step. In my tests, on a query with 600,000 rows, the Plain Union took about 3x as long to complete.
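A quick way to see the difference yourself (t1 and t2 are hypothetical tables):

```sql
-- UNION ALL: just concatenates the two result sets
EXPLAIN ANALYZE SELECT a FROM t1 UNION ALL SELECT a FROM t2;

-- UNION: same, plus a sort/hash step to remove duplicates
EXPLAIN ANALYZE SELECT a FROM t1 UNION SELECT a FROM t2;
```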
[GENERAL] Date math
Hello, I have a table with a DATE field birth_date. The data obviously contains various dates in the past, such as 07/04/1970. In my query, I need to retrieve the person's next birthday. In other words, for the example date 07/04/1970, the query should return 07/04/2009 for the current week, but after this July 4th, it would return 07/04/2010. Ultimately, I need to find people with next birthdays within a certain range. The best I've come up with so far is:

select case when to_char(birth_date, 'MMDD') <= to_char(current_date, 'MMDD')
            then (to_char(birth_date, 'MM/DD/')||(date_part('year', now())+1))::date
            else (to_char(birth_date, 'MM/DD/')||to_char(current_date,'YYYY'))::date
       end as next_birthday
from people
inner join openings on people.id=openings.id
where case when to_char(birth_date, 'MMDD') <= to_char(current_date, 'MMDD')
           then (to_char(birth_date, 'MM/DD/')||(date_part('year', now())+1))::date
           else (to_char(birth_date, 'MM/DD/')||to_char(current_date,'YYYY'))::date
      end between openings.item_date - interval '1 month'
              and openings.item_date + interval '1 month'

This seems to work for most cases, but fails for Feb 29 birthdates. And converting dates to strings and back again seems like a hack... Is there a better way? (I prefer to treat 02/29 as 03/01 for non-leap years.) Is there a way to add just enough years to birth_date to bring the result into the future? Adam
Re: Fwd: [GENERAL] Date math
Guy Flaherty wrote: On Sun, Jun 28, 2009 at 1:52 PM, Adam Rich ada...@sbcglobal.net wrote: Hello, I have a table with a DATE field birth_date. The data obviously contains various dates in the past, such as 07/04/1970. In my query, I need to retrieve the person's next birthday. In other words, for the example date 07/04/1970, the query should return 07/04/2009 for the current week, but after this July 4th, it would return 07/04/2010. Ultimately, I need to find people with next birthdays within a certain range. You could use the extract() function to calculate the day of year of the person's birthdate and then check if this number is within today's day of year and the range of days you want to check for, for example, today's day of year + 30 days to be within a month. That way you don't need to worry about years at all. You may need to double check this will work on the leap years though! Thanks! That's even better than what I just came up with: birth_date + ((interval '1 year') * ceil(EXTRACT(DAYS FROM (now() - birth_date))/365.25)) And I like the day-of-year solution because (I think) I can use a functional index on that value.
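Another string-free variant uses age() (a sketch, not tested against every edge case): age() gives completed years, so adding one more year always lands on or after today. Two caveats: on the birthday itself this returns next year's date, and Feb 29 birthdays land on Feb 28 in non-leap years, since PostgreSQL clamps rather than rolling to Mar 1 as the original poster preferred.

```sql
-- completed years + 1, added back to the birth date
SELECT birth_date,
       (birth_date + interval '1 year'
          * (extract(year from age(current_date, birth_date)) + 1)
       )::date AS next_birthday
FROM people;
```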
Re: [GENERAL] Question on Foreign Key Structure/Design
APseudoUtopia wrote: Hey list, I have a table with user IDs, among other information. I also have a table of comments that users can place on a page. CREATE TABLE users ( id SERIAL PRIMARY KEY, ... ); CREATE TABLE comments ( id SERIAL PRIMARY KEY, userid INTEGER REFERENCES users (id) ON DELETE RESTRICT, ... ); I'm new to the use of foreign keys and this is the first design I've created with them being implemented. I have a couple questions on the setup. 1. Do I need NOT NULL in the comments(userid) column? users(id) is automatically NOT NULL due to the primary key, but I'm not sure if comments(userid) needs NOT NULL as well, or if the foreign key will automatically transfer that property over. If comments must always be associated with a user, you should add the NOT NULL, to enforce that. You probably also want an index on that column. 2. I do not want to get rid of any comments, even if the user is deleted (on the application level, I'd display something like UnknownUser or UnknownUser#1234). Right now, I just have it ON DELETE RESTRICT, but that obviously prevents any users who have commented from being deleted. What do the more-experienced database admins suggest I do in this case? Should I set a DEFAULT of 0 on the comments, then use ON DELETE SET DEFAULT? Then, on the application level when 0 is found, it displays UnknownUser? Or should I just remove the foreign key completely, and on the application level, if the JOIN for the username returns empty/NULL, display UnknownUser#1234? You can't have an ID of 0 with a foreign key in place, unless there is a user with that ID. But rather than use 0, you should use NULL for that (NULL means unknown, whereas 0 should always have meaning). The foreign key will still accept NULL. When you add the foreign key, use the ON DELETE SET NULL clause for this purpose (instead of RESTRICT). Otherwise, the table structure you list above looks fine.
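A sketch of the resulting definition (note you cannot combine NOT NULL on userid with ON DELETE SET NULL, since deleting the user must be able to null the column out):

```sql
CREATE TABLE comments (
    id     SERIAL PRIMARY KEY,
    userid INTEGER REFERENCES users (id) ON DELETE SET NULL
    -- ... other columns ...
);

-- index the FK column so deletes on users don't scan all of comments
CREATE INDEX comments_userid_idx ON comments (userid);
```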
Re: [GENERAL] limit-offset different result sets with same query
Emanuel Calvo Franco wrote: Executing 'select * from datos limit 1 offset 15' two times, I get different result sets. When I execute 'explain analyze verbose query' I see that (as expected) the seq scan is occurring. Is that correct? Is it logical that a sequential scan of the physical table returns different data? I tested it on 8.4 beta1 and 8.3.5 and 8.3.7 with the same results. Thanks in advance. Emanuel, LIMIT and OFFSET are stable only when you have ORDER BY on unique values. Without that, the database is free to return the rows in whatever order it deems best, which gives unpredictable results when combined with LIMIT/OFFSET. Adam
Re: [GENERAL] UPDATE... FROM - will ORDER BY not respected?
Carlo Stonebanks wrote: (FOR PG VERSION 8.3.6 running on Windows 2003 Server x64) We have a function that assigns unique ID's (to use as row identifiers) to a table via an UPDATE using nextval(). This table is imported from another source, and there is a sequencing field to let the query know in which order to assign the row identifiers. (Please do not confuse the sequencing field with a sequence value from nextval().) The UPDATE command gets the order of the rows to update using a FROM clause, which in turn reads from a sub-query to get the rows in the order of seq. The problem is that the UPDATE is NOT behaving as if it is receiving the sequence identifiers in the order specified. In fact, it appears they are returned in REVERSE order (assigning id's in reverse order based on the values in seq). Here is the essence of the query (further below you will find the full DDL code of the function).

UPDATE impt_table SET id = nextval('id_seq')
FROM (SELECT seq FROM impt_table WHERE id IS NULL ORDER BY seq) AS empty_ids
WHERE impt_table.seq = empty_ids.seq AND impt_table.id IS NULL;

Was I wrong in assuming that the UPDATE would respect the order of rows coming out of the sub-clause? Is there a better way to do this? Thanks, Carlo I think the ORDER BY is free to update the rows in any order it needs to. The key is to put the sequence further down. How about this?

UPDATE impt_table SET id = newid
FROM (SELECT seq, nextval('id_seq') as newid
      FROM (SELECT seq FROM impt_table WHERE id IS NULL ORDER BY seq) AS pre_empty_ids
     ) as empty_ids
WHERE impt_table.seq = empty_ids.seq AND impt_table.id IS NULL;
Re: [GENERAL] UPDATE... FROM - will ORDER BY not respected?
Carlo Stonebanks wrote: I think the ORDER BY is free to update the rows in any order it needs to. The key is to put the sequence further down. How about this? Adam - thanks. Unless I hear otherwise, I will assume that you mean the UPDATE is free to update the rows in any order it wants - regardless of whether the data returned in the FROM clause is ordered. Whoops - you're right, that's what I meant. Even though one of the tables is sorted, when it joins the two tables, the planner might decide it's cheaper to update impt_table by iteratively looking up values from the sorted table that match the (unordered) rows from impt_table, or by hashing the keys, which I think also results in unsorted updates. You can use explain to see the details. Adam
Re: [GENERAL] After Upgrade from 8.2.6 to 8.3.6: function to_timestamp does not exist
OK, so you want to see if a timestamp is greater than now()? Why not just compare them? where a.from_datetime >= now() No, not the whole timestamp. I don't want to check the time. So I had to truncate the datetime with: date_trunc('day', a.from_datetime) >= date_trunc('day', NOW()) If you're going to truncate the NOW(), just go with CURRENT_DATE instead.
Re: [GENERAL] After Upgrade from 8.2.6 to 8.3.6: function to_timestamp does not exist
This query makes little sense. Why are you trying to convert a timestamp to a timestamp? Is this a bizarre substitute for date_trunc()? The from_datetime column is of type timestamp, but I want to check only the date, not the time. In this example I want to retrieve all records whose from_datetime is, e.g., >= 2009/05/06 (Now()), so I'd like to get results with a from_datetime like e.g. - 2009/05/06 00:05:00 - 2009/05/06 23:30:00 - 2009/05/07 10:15:00 Regards Nico I use something like this: Where from_datetime::date >= current_date Or Where date_trunc('day', from_datetime) >= current_date (current_date is like now() except it's a date instead of a timestamp)
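For completeness, a sketch of the index-friendly variants (mytable is hypothetical): comparing the raw column against date bounds lets an ordinary index on from_datetime be used, unlike wrapping the column in a cast or date_trunc().

```sql
-- everything from today onward
SELECT * FROM mytable a
WHERE a.from_datetime >= current_date;

-- exactly one calendar day, as a half-open range
SELECT * FROM mytable a
WHERE a.from_datetime >= date '2009-05-06'
  AND a.from_datetime <  date '2009-05-06' + 1;
```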
Re: [GENERAL] Query question
-Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- ow...@postgresql.org] On Behalf Of Sharma, Sid Sent: Tuesday, February 24, 2009 12:47 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Query question Hi, I am new to Postgres and am trying to write a query like the one below (without any luck): SELECT col_speed, col_time, (col_speed / col_time) AS distance FROM speed_ratings HAVING distance > ? ORDER BY distance In other words, I want to filter on a calculated column. But I get an error that column distance is not defined: column distance does not exist at character 272 Interestingly, if I remove the filter (HAVING distance > ?), the query works. So I can sort on distance but not filter. I have tried substituting the HAVING clause with a WHERE clause as well, with no luck. I have also added a GROUP BY clause with the HAVING as well, with no luck. Any ideas? Thanks Sid You were on the right track; unfortunately, the rules are not very consistent regarding when aliases can or cannot be used. In this case, WHERE and HAVING cannot use an alias, but ORDER BY can. Also, HAVING applies to aggregate functions (like min/max/average). Try your query in this form: SELECT col_speed, col_time, (col_speed / col_time) AS distance FROM speed_ratings WHERE (col_speed / col_time) > ? ORDER BY distance If you want to use GROUP BY / HAVING, you need another field to group the results by, as well as an aggregate function. For example, if you had a type_id field and wanted the maximum distance travelled per type: SELECT type_id, MAX(col_speed / col_time) AS max_distance FROM speed_ratings GROUP BY type_id HAVING MAX(col_speed / col_time) > ? ORDER BY max_distance Finally, if you really want distance, I assume you mean speed * time, not speed / time.
Re: [GENERAL] trying to make sense of deadlocks
-Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- ow...@postgresql.org] On Behalf Of Richard Yen Sent: Monday, February 09, 2009 4:18 PM To: pgsql-general@postgresql.org Subject: [GENERAL] trying to make sense of deadlocks Hi, I'm trying to make sense of a situation I ran into this morning. Apparently, there were numerous deadlocks (approx. 75 in a 30-min period) while procs were trying to write to a table (16634, account) in my database. Just to give you a sense of what's going on, process 22583 tried to do an insert, followed by an update, and the insert took 225925.724 ms: Might someone be able to help me make more sense of this? The two processes that are conflicting, can you list everything they do since the beginning of the transaction until the deadlock?
Re: [GENERAL] Strange limit and offset behaviour....
I have the following sql: SELECT * from table order by dato asc limit 20 offset 0 This gives me different rows than the 20 first rows when running the following sql: SELECT * from table order by dato asc Shouldn't the first 20 rows in the second sql statement be the same 20 rows that are returned in the first statement, or am I missing something? Not necessarily. In your example query, if dato is not a unique column and there are some duplicates, the top 20 rows are not a well-defined set. Adding the offset clause might cause a different query plan, resulting in a different ordering of the duplicate values.
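The usual fix is to add a unique tie-breaker to the ORDER BY (a sketch; mytable and the primary key id are hypothetical names), which makes the first 20 rows deterministic:

```sql
-- id breaks ties among duplicate dato values, so pagination is stable
SELECT * FROM mytable
ORDER BY dato ASC, id ASC
LIMIT 20 OFFSET 0;
```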
Re: [GENERAL] field with Password
I would like to create a new table where one of the fields would be a user password. Is there any data type for supporting this functionality? Something like a Password DataType. I've taken a look at the available data types in the PgAdmin application and there is nothing similar to this. Most commonly, passwords are stored as hashes, such as md5, rather than plaintext. 'text' would be as suitable for this as anything, or bytea, if you want to store the hashes in binary. Thanks for your answers. Sorry for the questions but I'm new to Postgre :) The problem with a plain text password is that a user can see it by looking at the user table. Both suggest to use MD5. How can I use it? Any link or example about this would be very appreciated. Insert new users like this: insert into myusers (usernm, passwd) values ($user, MD5($pass)); So the plaintext password is not stored. But you should still restrict access to this table. Revoke rights from regular users. When a user logs in, check their access like this: select * from myusers where usernm=$user and passwd=MD5($pass); The hash of a particular password is always the same. To make this scheme more secure, you should add a salt before hashing. (You can find how to do this via google.)
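If the pgcrypto contrib module is available, its crypt() and gen_salt() functions handle the salting for you; a sketch using the thread's myusers table (the literals are placeholders):

```sql
-- store: gen_salt('bf') generates a random bcrypt salt per password
INSERT INTO myusers (usernm, passwd)
VALUES ('alice', crypt('s3cret', gen_salt('bf')));

-- check: crypt() reuses the salt embedded in the stored hash
SELECT * FROM myusers
WHERE usernm = 'alice'
  AND passwd = crypt('s3cret', passwd);
```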
Re: [GENERAL] ramblings about password exposure (WAS: field with Password)
On Wed, Feb 04, 2009 at 09:34:56AM -0500, Raymond C. Rodgers wrote: You don't need to depend on an external library for this functionality; it's built right into Postgres. Personally, in my own apps I write in PHP, I use a combination of sha1 and md5 to hash user passwords, without depending on Postgres to do the hashing, but the effect is basically the same.

Doing the hashing outside PG would reduce the chance of the password being exposed, either accidentally by, say, turning on statement logging, or maliciously. A general rule with passwords is to throw away any copy of a plain text password as quickly as possible; sending the password over to another process would go against this.

Agreed. Another benefit of this is that the hashing support in PHP is more flexible. I personally use the hash() function to get a SHA-256 hash instead of the weaker sha1 or md5.
Re: [GENERAL] Pet Peeves
On Saturday 31 January 2009 8:47:28 pm Adam Rich wrote: On Thu, 29 Jan 2009 13:16:17 + Gregory Stark stark(at)enterprisedb(dot)com wrote: So, what do people say? Is Postgres perfect in your world or does it do some things which rub you the wrong way? I see all the major ones have already been mentioned, so here's some minor ones. - lack of system-level and DDL triggers - inability to limit triggers to certain columns - inability to know the DML operation causing a trigger

From: http://www.postgresql.org/docs/8.3/interactive/plpgsql-trigger.html TG_OP Data type text; a string of INSERT, UPDATE, or DELETE telling for which operation the trigger was fired. This is also available in plpythonu; I don't know about the other PLs.

Thanks, I knew this was available for the python and perl PLs; I wasn't aware it was in plpgsql too. Still, it would be nice to have something akin to Oracle's IF (UPDATING('col_name')) THEN

- date_part/extract returning floats instead of integer

Maybe this is what you are looking for?: http://www.postgresql.org/docs/8.3/interactive/datatype-datetime.html Note: When timestamp values are stored as double precision floating-point numbers (currently the default), the effective limit of precision might be less than 6. timestamp values are stored as seconds before or after midnight 2000-01-01. Microsecond precision is achieved for dates within a few years of 2000-01-01, but the precision degrades for dates further away. When timestamp values are stored as eight-byte integers (a compile-time option), microsecond precision is available over the full range of values. However, eight-byte integer timestamps have a more limited range of dates than shown above: from 4713 BC up to 294276 AD. The same compile-time option also determines whether time and interval values are stored as floating-point or eight-byte integers. In the floating-point case, large interval values degrade in precision as the size of the interval increases.

Nope, I mean if you use date_part to extract a piece of a date, you get a float instead of an integer. It trips me up every time I try something like this: select * from table where (weekmask & (1 << date_part('DOW', $1))) > 0 To my surprise, the operator fails because it requires an integer argument, but date_part provides only a double floating point. I realize this is documented as intended behavior, but why? Is there any scenario where DOW (or day, year, hour, or *any* field really) would be returning a fractional number?

- parts of the SQL statement (e.g. 'for update of') requiring table aliases when present instead of table names. - lack of queryable high-water marks useful for tuning - lack of an auto-tuner, for that matter. - inability to log (e.g. long-running queries) to a table - lack of custom session-level variables (without editing postgresql.conf) - lack of autonomous transactions -- Adrian Klaver akla...@comcast.net
Re: [GENERAL] Indexing columns with low cardinality: persistent bitmap indexes?
Hello, I have a column with a small number of distinct values; indexing it with a standard BTree is useless. How can I index this column efficiently? I searched and it seems that pg doesn't support the creation of persistent bitmap indexes... Is that feature planned in next releases of pg?

You have a few options based on your access patterns. If you tend to access just one of these values and get them all at once, then either clustering on this value, or partitioning your table, will help.

How will clustering benefit this pattern? Won't a full table scan be required regardless of the table being clustered? And I thought the point of clustering was to organize the table by some indexed key, requiring fewer seeks and increasing the likelihood of the pages being in the cache. If the index is never used in this case (low cardinality), would it still help?

If you access your data using these values and other column values at the same time, then partial or multi-column indexes might help.
Re: [GENERAL] How to use index in strpos function
-Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Tuan Hoang Anh Sent: Tuesday, December 30, 2008 10:49 PM To: pgsql-general@postgresql.org Subject: [GENERAL] How to use index in strpos function

I have table command CREATE TABLE command And one index CREATE INDEX command_command ON command USING btree(upper(command)); And have a query that uses it: explain select * from command where strpos('APCTPN1.EXE PN1', UPPER(command)) > 0 AND UPPER(command) <> '' Seq Scan on command (cost=1.00..10015.26 rows=92 width=200) Filter: ((upper((command)::text) <> ''::text) AND (strpos('APCTPN1.EXE PN1'::text, upper((command)::text)) > 0)) This query is run a lot, so I want it to use an index. How can I use an index for this query?

Is the first argument to strpos always the same ('APCTPN1.EXE PN1')? If so, you can create a functional index like this: CREATE INDEX strpos_command ON command ((strpos('APCTPN1.EXE PN1', UPPER(command.command)))); However, if the argument is different for each query, then you will not be able to utilize a functional index for this type of query. If you mean to query for "starts with" you can rewrite your query as: select * from command where UPPER(command.command) LIKE 'APCTPN1.EXE PN1%' However, if you mean to query for "contains substring" then a regular index or functional index will not help. A full-text index might help, but it is more complex to set up and use. The documentation for that is here: http://www.postgresql.org/docs/8.3/interactive/textsearch.html
Re: [GENERAL] Ordering returned rows according to a list
Hi all, I'd like to make a single query that returns a number of rows using a 'WHERE id IN (list-of-ids)' condition, but I'd like the rows to be returned in the order in which the ids are given in the list. Is this possible? Depending on how many IDs you have in your list, you can accomplish this with a CASE statement: SELECT * FROM MYTABLE WHERE id IN (6, 9, 3) ORDER BY CASE id WHEN 6 then 1 WHEN 9 then 2 WHEN 3 then 3 END
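The CASE expression above just maps each id to its position in the caller's list. The same mapping can be done client-side when the list is long; a sketch (the names are illustrative, not from the thread):

```python
def order_by_id_list(rows, ids, key=lambda row: row[0]):
    """Sort rows so they appear in the same order as their ids in `ids`."""
    position = {id_: i for i, id_ in enumerate(ids)}  # id -> list position
    return sorted(rows, key=lambda row: position[key(row)])

rows = [(3, "three"), (6, "six"), (9, "nine")]
ordered = order_by_id_list(rows, [6, 9, 3])
```

From PostgreSQL 8.2 onward, another server-side route is to join against a VALUES list such as (VALUES (6,1),(9,2),(3,3)) and order by its second column.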
Re: [GENERAL] View vs Constantly Updated Table
If I have a column that is a calculation, say a bank balance - the sum of all the debits and credits - is it more efficient to make a view that executes the underlying calc query doing the math, or to create a table that has a column called balance that is updated for each transaction? So in the end, select balance from view or select balance from table? What are the pros and cons?

How often are you using the bank balance value? If you're updating it for every transaction, you would be doing a lot of work computing values that may rarely if ever get used. That's an argument for the view route, since the computation only happens when necessary. The opposite argument is how long does the computation take, and how quickly do you need it? The pre-computed value would obviously be much faster than waiting for it to be computed on the fly. Other things to keep in mind... you might want to make the balance calculation a separate function rather than building it into the view, unless it's used on *every* query. Also, if you would be taking any action with the value returned by the balance calculation, remember to lock any tables necessary to ensure the balance doesn't change between the time you compute it and the time you act on it.
Re: [GENERAL] Multi-table CHECK constraint
On Wed, Dec 10, 2008 at 05:58:08PM -0600, Jason Long wrote: I need to add some complex constraints at the DB. For example. Do not allow a line item of inventory to be changed if it does not result in the same number of joints originally shipped. These will involve several tables. What is the best approach for this? You might want to talk to people who have done bookkeeping applications for PostgreSQL, or possibly even buy one of the proprietary PostgreSQL-based systems for it, as this stuff can be fiendishly tricky to get right. As a developer of just such a bookkeeping application, here's (IMHO) the best way to handle this: Wrap the dependent operations into one stored procedure, grant rights to that procedure and not to the underlying tables. If an operation (such as shipping product) requires multiple database queries and updates, wrapped in a transaction, don't place your trust in every user and/or application to do that properly. In your example, don't give the user or application UPDATE permission to the raw inventory or product tables, that's just asking for trouble. Instead, create a ship_product() procedure that takes all the steps required. (You'll also need others, adding inventory for example) There are many, many benefits to reap once you've made the commitment to doing this. More re-usable code, a stable API, atomic operations, faster transactions, less traffic over the wire, etc etc. I would still add critical constraints and triggers as a failsafe so an admin with rights can't accidentally introduce bad data to the system, but there's just no substitute for proper encapsulation. Plus, sometimes it's expensive or impossible to verify after the fact (in a constraint trigger) whether the transaction was valid, but just wrapping the stuff in a stored procedure is much simpler.
Re: [GENERAL] SELECT DISTINCT ... ORDER BY problem
When we get windowing functions, a lot of this pain will go away :) Yes! Hope it won't be too long now. The patch seems to behave like it should now :) Hopefully we'll see it committed for 8.4. Though this does not look too much cleaner, at least it's standard SQL: A preview for Madi: SELECT foo,bar FROM (SELECT foo,bar, ROW_NUMBER() OVER (PARTITION BY foo ORDER BY bar) AS pos FROM table ) AS t WHERE pos = 1 ORDER BY bar; Probably easier to understand what's going on in this one. David.

Is Oracle's FIRST_VALUE function not a SQL standard? The way I would do this in Oracle looks like: SELECT foo, FIRST_VALUE(bar) OVER (PARTITION BY foo ORDER BY bar) as bar FROM table http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/functions059.htm
Re: [GENERAL] why hash on the primary key?
I'm seeing a lot of plans in my database that look like this: It seems very strange for the planner to decide to build an in-memory hash table on a column that is already indexed (the primary key, no less!). But this is happening A LOT - I often see plans where a majority of the joins are executed this way (and they're not all self-joins either...). It seems like the planner is concluding that it's going to need most or all of the pages in the table anyway, and that building a hash table as it goes is quicker than reading the index pages in from disk. On a simple query like the above, setting enable_seqscan to off or random_page_cost to 1 generates the expected plan: Experimentation shows this is actually about 25% faster. But, this is kind of a blunt instrument, and my attempts to fiddle with various parameters have not been real successful in generating better plans for more complicated examples. Any suggestions/explanations? ...Robert

Could you send the output of these two queries using explain analyze instead of plain explain?
Re: [GENERAL] date range query help
Now user Ben has passed his mobile to user Josh and we issued Josh his mobile on 2008-10-15. 1. Is it possible for me to write a query that will have the fields call.call_id, call.datetime, mobile_custodian.user_id, call.mobile_no, call.charge, and that will use call.datetime to look up the date range from mobile_custodian.issue_date and mobile_custodian.return_date to identify the right user for each call? 2. Do I need to change the issue_date and return_date fields to timestamp to perform the above?

No, a date will work fine. Try this: select call.call_id, call.datetime, mobile_custodian.user_id, call.mobile_no, call.charge from call, mobile_custodian where call.mobile_no = mobile_custodian.mobile_no and call.datetime between mobile_custodian.issue_date and mobile_custodian.return_date
Re: [SQL] [GENERAL] date range query help
sorry I get nothing :( Of course not. None of the dates you gave in the example overlap. But it should still have the 1st entry with the name Ben? Am I missing something? Ben's issue dates are in the year 2008. The first call entry is in the year 2007. There are no custodians with a matching issue date.
[GENERAL] time math - Bug or expected behavior?
I traced a bug in our application down to this basic operation: set timezone to 'US/Eastern'; select '11/02/2008'::timestamptz, '12:10'::time, '11/02/2008'::timestamptz + '12:10'::time; I have a date and a time stored separately and I want to combine them, and use them in some timezone-aware calculations. When I add the time 12:10 to the date 11/2/08, I expect the timestamp 11/2/08 12:10 but instead, I get 11/2/08 11:10. It's probably not coincidence that daylight saving time rolls back one hour on the morning of 11/2. Still, I would have expected the above behavior when adding an interval to a timestamp, but not a time. Is the time being cast to an interval before the add? Is there a better way to combine a date with a time and get a timestamptz ? (the values are stored in the database, and are not literals as in my example)
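One way to sidestep the problem (my suggestion, not from the thread) is to combine the stored date and time first and only attach the zone afterwards, so the wall-clock reading is preserved across the DST boundary. A sketch in modern Python, assuming the system time-zone database is available:

```python
from datetime import date, datetime, time, timedelta
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")  # the US/Eastern zone from the post

d = date(2008, 11, 2)  # the day DST ends
t = time(12, 10)

# Combine date and time first, then interpret the result in the zone:
# the wall-clock value 12:10 is kept, and the offset (EST, -05:00)
# is resolved afterwards rather than by adding across the DST fold.
combined = datetime.combine(d, t, tzinfo=eastern)
```

The SQL analogue is to compute date_col + time_col as a plain timestamp and then apply AT TIME ZONE, rather than adding a time to an already-zoned timestamptz.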
Re: [GENERAL] index speed and failed expectations?
This query from the console: select * from stats order by start_time; takes 8 seconds before starting its output. Am I wrong in assuming that the index on start_time should make ORDER BY orders of magnitude faster? Or is this already fast enough? Or should I max up some memory (buffer) setting to achieve greater speeds? Not that the speed is crucial, just curious. Postgresql won't use the index for queries like this. Due to the MVCC implementation, the index does not contain all necessary information and would therefore be slower than using the table data alone. (What postgresql lacks is a first_row/all_rows hint like oracle) However, if you limit the number of rows enough, you might force it to use an index: select * from stats order by start_time limit 1000;
Re: [GENERAL] a SQL query question
Hi, I have a table of the form

aid pid nmol
--- --- ----
 1  23    34
 2  45  3445
 3  23   100
 4  78    12
 5  45    14
 6  45   200
 7  null null

In general, aid is unique; pid and nmol are non-unique. What I'm trying to do is to select those rows where pid is not null, grouped by pid. From within each group I'd like to select the row that has the maximum value of nmol. So I'd end up with

aid pid nmol
--- --- ----
 3  23   100
 2  45  3445
 4  78    12

I can easily do the first step, but am struggling to make the SQL for the second step. Any pointers would be appreciated.

Normally this is a difficult sort of thing to do, but it's made easier by a unique feature of Postgresql. Please try the following: SELECT DISTINCT ON (pid) aid, pid, nmol FROM tbl WHERE pid IS NOT NULL ORDER BY pid ASC, nmol DESC More information can be found here: http://www.postgresql.org/docs/8.3/interactive/sql-select.html#SQL-DISTINCT
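What the DISTINCT ON query computes here - one row per pid, ranked by nmol descending - can be sketched procedurally with the thread's sample data (the null-pid row already excluded):

```python
rows = [  # (aid, pid, nmol)
    (1, 23, 34), (2, 45, 3445), (3, 23, 100),
    (4, 78, 12), (5, 45, 14), (6, 45, 200),
]

best = {}  # pid -> row with the largest nmol seen so far
for aid, pid, nmol in rows:
    if pid not in best or nmol > best[pid][2]:
        best[pid] = (aid, pid, nmol)
```

This is exactly the "keep the first row of each group after ORDER BY pid, nmol DESC" behaviour that DISTINCT ON gives declaratively.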
Re: [GENERAL] Optimizing a like-cause
Hello, I'm developing an autocomplete feature using PHP and PostgreSQL 8.3. To fill the autocomplete box I use the following SQL statement: select * from _table_ where upper( _field_ ) like '%STRING%'; This SQL statement takes 900 ms on a table with 300,000 entries. What can I do to speed up the statement? What index can I set?

The open-ended search is what's killing you. Can you change your query to be like this? select * from _table_ where _field_ like 'STRING%'; That allows the database to use an index. You'll still have to either store the data already in upper-case format, or use a functional index on upper(field). http://www.postgresql.org/docs/8.3/interactive/indexes-expressional.html
Re: [GENERAL] Need help with constraint to prevent overlaps
I'm building a shift-scheduling app. I want to make a constraint in my database that prevents one human from being assigned to work two different jobs at the same time. In other words, if I schedule John Doe to mop bathrooms from 10 AM until 4 PM, some other manager will not be able to schedule John Doe for a 1 PM meeting. How can I do this with constraints? Would I need to write a trigger that does some 'select ... between ...' work?

Matt, We do a lot of scheduling work, and the way we handle this is with a stored procedure. The only way to add something to the schedule is by calling the stored procedure. The procedure queries the existing schedule first, using the OVERLAPS function listed here: http://www.postgresql.org/docs/8.1/static/functions-datetime.html OVERLAPS is easy and simple to use. If any conflicts are found, we can return information about what's overlapping to the application. You'll have to do some locking as well, so you don't create a race condition between when you check for a conflict and when you commit the new schedule item. (You could probably put the logic into a trigger too, but that would just be throwing away the insert or raising an exception, whereas with the stored procedure, we're returning an actual rowset of details regarding the overlapping schedule item).
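The test OVERLAPS performs is simple to state: two periods conflict when each one starts before the other ends, and periods that merely touch at an endpoint do not conflict. A small sketch of that rule (names are mine):

```python
from datetime import datetime

def periods_overlap(start1, end1, start2, end2):
    """SQL-OVERLAPS-style test on half-open intervals: a shift ending
    at 4 PM and a meeting starting at 4 PM do not conflict."""
    return start1 < end2 and start2 < end1

def hour(h):
    # Helper: a datetime at hour h on an arbitrary example day.
    return datetime(2008, 6, 2, h)
```

Using the thread's example, a 1 PM meeting conflicts with a 10 AM - 4 PM cleaning shift, but a 4 PM - 5 PM task does not.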
Re: [GENERAL] Complicated GROUP BY
Dear All, I have the following problem with grouping: I want to know the maximum in a group as well as the maximal element. Example: I have a table products_providers: product | provider | criteria_1 | criteria_2 I have a number of products, each of them from several providers. Each product is described by two numeric values. I can easily select the best value for each product by a given criteria, like: select product, max(criteria_1) from products_providers group by product; but I need to know the best-scoring provider as well. The result I need should look like: product | best_provider_1 | best_criteria_1 | best_provider_2 | best_criteria_2 If it helps, the results may be split into two tables: one for the first and the other for the second criteria. Can you help me with a painless solution?

Is something like this what you're after? select * from products_providers order by criteria_1 desc limit 1 You can get the best providers for both criteria using union like this: select * from ( select 'best_criteria_1' as name, product, provider, criteria_1, criteria_2 from products_providers order by criteria_1 desc limit 1 ) x union select * from ( select 'best_criteria_2' as name, product, provider, criteria_1, criteria_2 from products_providers order by criteria_2 desc limit 1 ) y
Re: [GENERAL] Advice Wanted on Selecting Multi-row Data Requests in 10-Row Blocks
What I want to do is present the results of a query in a web page, but only 10 rows at a time. My PostgreSQL table has millions of records and if I don't add a LIMIT 10 to the SQL selection, the request can take too long. The worst case scenario is when the user requests all records without adding filtering conditions (e.g. SELECT * FROM MyTable;) That can take 10-15 minutes, which won't work on a web application. What I'm wondering is how in PostgreSQL do you select only the first 10 records from a selection, then the next 10, then the next, and possibly go back to a previous 10? Or do you do the full selection into a temporary table once, adding a row number to the columns and then performing sub-selects on that temporary table using the row id? Or do you run the query with Limit 10 set and then run another copy with no limit into a temporary table while you let the user gaze thoughtfully at the first ten records? I know how to get records from the database into a web page, and I know how to sense user actions (PageDown, PageUp, etc.) so I'm basically looking for techniques to extract the data quickly.

In addition to LIMIT, Postgresql has an OFFSET clause: http://www.postgresql.org/docs/8.3/interactive/sql-select.html#SQL-LIMIT So if you want to show the records in pages of 10, your queries would look like this: SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 0; SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 10; SELECT * FROM MyTable ORDER BY my_key LIMIT 10 OFFSET 20; The offset clause tells postgresql how many rows to skip. Note that you always need an order by clause in there as well to get meaningful results.
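The offset arithmetic for a given page number is mechanical enough to wrap in a helper. A sketch (the names are mine, and in real code the values should be bound as query parameters rather than interpolated into the SQL string):

```python
def page_to_limit_offset(page, page_size=10):
    """Return the (LIMIT, OFFSET) pair for a 1-based page number."""
    if page < 1:
        raise ValueError("pages are numbered from 1")
    return page_size, (page - 1) * page_size

limit, offset = page_to_limit_offset(3)  # third page of 10 rows
query = f"SELECT * FROM MyTable ORDER BY my_key LIMIT {limit} OFFSET {offset}"
```

Bear in mind that a large OFFSET still makes the server produce and discard all the skipped rows; for deep paging, keyset pagination (WHERE my_key > last_seen ORDER BY my_key LIMIT 10) scales better.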
Re: [GENERAL] query planner weirdness?
Bob Duffey [EMAIL PROTECTED] writes: I'm seeing some query plans that I'm not expecting. The table in question is reasonably big (130,000,000 rows). The table has a primary key, indexed by one field (ID, of type bigint). Thus, I would expect the following query to simply scan through the table using the primary key: select * from T order by ID This is not wrong, or at least not obviously wrong. A full-table indexscan is often slower than seqscan-and-sort. If the particular case is wrong for you, you need to look at adjusting the planner's cost parameters to match your environment. But you didn't provide any evidence that the chosen plan is actually worse than the alternative ... I think I understand what Bob's getting at when he mentions blocking. The seqscan-and-sort would return the last record faster, but the indexscan returns the first record faster. If you're iterating through the records via a cursor, the indexscan behavior would be more desirable. You could get the initial rows back without waiting for all 130 million to be fetched and sorted. In oracle, there is a first-rows vs. all-rows query hint for this sort of thing.
Re: [GENERAL] compiling, performance of PostGreSQL 8.3 on 64-bit processors
1. I have heard of problems arising from compiling PostGreSQL (8.3) on 64-bit processors. What sort of problems am I likely to encounter and how should I fix them? We will run Linux Redhat 5 on a Dell PE2950 III Quad Core Xeon E54 2.33 GHz, and a Dell PE2950 III Quad Core Xeon L5335 2.0 GHz. 2. Are there performance problems running PostGreSQL 8.3 on a 64-bit processor?

I have a few more questions on the 64-bit topic. Is there any benefit to running a 32-bit OS (rhel 5 in this case) on a server with more than 4 GB of memory? In other words, can the OS-level cache take advantage of more than 4 GB of memory? Can a process (such as a PG backend) use more than 4 GB of shared memory on a 32-bit OS? Or is the 4 GB memory point the place where you normally transition to a 64-bit OS? For people with experience running postgresql on systems with 16+ GB of memory, what parameter settings have you found to be effective? (This would be a large database that's mostly read-only that we'd like to fit completely in memory) Is it possible to backup (pg_dump) from a 32-bit OS to a 64-bit OS, or is a plain SQL dump necessary?
Re: [GENERAL] what are rules for?
Can you describe, or point me to somewhere which describes, all the things you can do with a rule that you can't do with a trigger? The only examples of rules in the manual are (1) logging, which I've just been told is much better done with a trigger, and (2) making update, insert, and delete work for a view, which is the only way to do it because views are not allowed to have update, insert, or delete triggers. However, as I have learned in several recent threads, this use of rules is fraught with difficulties, especially when the view has more than one table, and it seems that it would be much easier if triggers were just allowed on views. What is the real purpose of the rule system? You can read more about rules here: http://www.postgresql.org/docs/8.3/interactive/rules.html The documentation calls rules a query rewrite system, which helped me understand their use. Whereas triggers are called once per row modified, rules can modify or replace the actual query tree being executed. There are some fine examples here: http://www.postgresql.org/docs/8.3/interactive/rules-triggers.html Rules can be used to change a SELECT statement in-flight. This is actually how views are implemented in postgresql. One interesting example is having rules and triggers watching for deletes or updates on a table. If many rows are modified, rules can be faster. Take this statement: DELETE FROM mydata WHERE idval BETWEEN 1 and 2; Say this statement deletes 10,000 rows. The delete trigger would get called 10,000 times whereas the rule is essentially executed once, since it can share the WHERE clause of the user's query.
Re: [GENERAL] hopefully a brain teaser, can't quite figure out query
-Original Message- The small table is a listing of county fips codes, their name, and the geometry for the county. Each fips is only listed once. The big table is multiple emissions for each county, the parameter for the emission, and the source code for the emission (scc). Each county in the big table has many entries, a variable number of pollutant types, and a variable number of scc's.

SELECT small.fips, small.name, sum(big.value) FROM small, big WHERE small.fips in ( SELECT fips from big WHERE ((pollutant='co') AND ( (scc LIKE '21%') OR (scc LIKE '2301%') OR (scc LIKE '280100%') )) GROUP BY fips HAVING SUM(value) > 2000 ) GROUP BY small.fips, small.name;

This is the query that isn't returning yet. If anyone has any questions, comments, or any suggestions at all, I'll do my best to respond ASAP.

This sounds like what you want: SELECT small.fips, small.name, sum(big.value) as big_sum FROM small INNER JOIN big on small.fips = big.fips WHERE pollutant = 'co' AND (scc LIKE '21%' OR scc LIKE '2301%' OR scc LIKE '280100%') GROUP BY small.fips, small.name However, I'm not sure I understand this part: I would return the 123 fips ONLY if the value provided was less than the sum of the values for all scc's (500+550+1500+50 = 2600), as well as the sum for those values. Can you clarify?
Re: [GENERAL] IN vs EXISTS
-Original Message- Hi all, I have been using the IN clause almost exclusively until recently I tried to use EXISTS and gained a significant performance increase without changing/creating any indexes: SELECT ... FROM a WHERE a.ref IN (SELECT b.id FROM b WHERE ...) vs SELECT ... FROM a WHERE EXISTS (SELECT 1 FROM b WHERE a.ref=b.id ...) Performance is at least a few times better when EXISTS is used. Is it just PostgreSQL specific?

IN should produce a different query plan than EXISTS. (You can run explain analyze on your queries to see the different plans.) Which one is faster depends on your data and on your server. Also, what's faster on one dbms may be different on another. I've found that postgresql is usually slower than other databases for IN () queries, but handles EXISTS and inner joins (a third way of writing your queries above) quite quickly. SELECT a.foo FROM a INNER JOIN b on a.ref=b.id
Re: [GENERAL] Multiple result sets
I need these 5 queries, fired from the same ajax request to a Python web application, to see the same database snapshot. The driver is psycopg2. Since postgresql 8.2 functions can't return multiple result sets, what would be the best approach? You want to set your transaction isolation to Serializable. Then execute your 5 queries via the same connection, and the same transaction. You can do that with this command: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Re: [GENERAL] Multiple result sets
I need these 5 queries, fired from the same ajax request to a Python web application, to see the same database snapshot. The driver is psycopg2. Since postgresql 8.2 functions can't return multiple result sets, what would be the best approach? You want to set your transaction isolation to Serializable. Then execute your 5 queries via the same connection, and the same transaction. You can do that with this command: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; I'm not sure I got it. You mean like this?: import psycopg2 as db dsn = 'host=localhost dbname=dbname user=user password=passwd' connection = db.connect(dsn) cursor = connection.cursor() cursor.execute('BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE') rs1 = cursor.execute(query_1, (param1,)) rs2 = cursor.execute(query_2, (param2,)) cursor.execute('commit') cursor.close() connection.close() I tested it and it raises no exception. I just don't understand if a transaction persists between execute() calls. I am not familiar with the python library, but that looks correct to me. You can always test it by adding a sleep between your two queries and modifying the database from a console connection during the sleep. Note that I'm assuming your 5 queries are all read-only selects. If you're modifying data during your queries, and another concurrent database connection modifies the same data during your transaction, the later modifications will fail under serializable isolation.
Re: [GENERAL] Multiple result sets
I am not familiar with the python library, but that looks correct to me. You can always test it by adding a sleep between your two queries and modifying the database from a console connection during the sleep. Note that I'm assuming your 5 queries are all read-only selects. If you're modifying data during your queries, and another concurrent database connection modifies the same data during your transaction, the later modifications will fail under serializable isolation. Which one will fail, the second query or the serializable transaction? My understanding was that the serializable transaction will fail. Yes, serializable transactions fail if they attempt to modify data which has changed since the beginning of the transaction. If the OP's ajax call kicks off 5 queries in a serializable transaction, there is the potential for failure if two ajax calls initiate simultaneous serializable transactions modifying the same data.
Re: [GENERAL] The optimizer is too smart for me - How can I trick it?
I've implemented Depesz's running total function (http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/) in my DB, which works great. Now what I want to do is get the running total for a certain statement and then do a subselect on that result so as to get a non-zero start on a function. Instead, the optimizer sees what I'm trying to do, moves the where clause inside the subquery, and my output is no longer what I want. What can I do to tell the optimizer to keep its hands off my query, or at least get it to not optimize? I think if you add a LIMIT/OFFSET clause to your subquery, the planner will leave it alone.
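The LIMIT/OFFSET trick usually looks like this. A sketch with made-up names (running_sum() and txns stand in for the OP's setup): OFFSET 0 returns the same rows, but in PostgreSQL of this era it acts as a fence that discourages the planner from pushing the outer WHERE down into the subquery.

```sql
-- Hypothetical names: running_sum() and txns are stand-ins for the OP's schema.
SELECT *
FROM (
    SELECT id, amount, running_sum(amount) AS total
    FROM txns
    ORDER BY id
    OFFSET 0      -- no-op row-wise; keeps the planner from pushing
                  -- the outer WHERE inside this subquery
) AS t
WHERE total > 1000;   -- now applied after the running total is computed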
Re: [GENERAL] Finding records that are not there
I have two tables that have identical index fields, maplot and unitno (both indexes span two columns), and I want to find all the records in the commcost table that don't have a corresponding record in the bldg table. The SQL I've tried is: select commcost.maplot, commcost.unitno from commcost where not exists(select 1 from commcost, bldg where commcost.maplot = bldg.maplot and commcost.unitno = bldg.unitno) order by commcost.maplot It returns no records although I know that there are records in commcost which do not match keys with records from bldg. You shouldn't put commcost in your inner select, since it's already in your outer select. Or try this, it's probably faster: select c.maplot, c.unitno from commcost c left join bldg b on c.maplot = b.maplot and c.unitno = b.unitno where b.unitno is null
Re: [GENERAL] Results of stored procedures in WHERE clause
I need to be able to do queries that restrict my result set to items belonging to a specified site and ignore all nodes that belong to different sites. To determine the ID of the site an item belongs to I wrote a stored procedure: This returns the ID of the root node for non-root nodes, the node's own ID for root nodes, and NULL for invalid IDs. I'm writing a query to do document searching (the version given is simplified to the problem in hand). SELECT cms_v_items.*, getroot(cms_v_items.itm_id) AS itm_root FROM cms_v_items WHERE itm_root = ?; I was hoping this query would return a set of items that had the same root node. Instead it throws an error, column itm_root does not exist. I'm obviously doing something wrong here, but what? I don't think you can reference an alias in the where clause. You'll have to repeat it, like this: SELECT cms_v_items.*, getroot(cms_v_items.itm_id) AS itm_root FROM cms_v_items WHERE getroot(cms_v_items.itm_id) = ?; Don't worry, I think with the function marked STABLE, postgresql is smart enough not to call it twice. I think you could further optimize your function by doing something like this: SELECT cms_v_items.*, getroot(cms_v_items.itm_parent) AS itm_root FROM cms_v_items WHERE itm_parent = ? OR getroot(cms_v_items.itm_parent) = ?; This will save one loop. Keep in mind, both queries will perform the getroot() function call for every single row in cms_v_items. You may want to experiment with a function that takes the root ID as a parameter and returns an array, or a rowset, of just the items beneath that root. Then you'd use that function in your query by joining to the results or using = ANY. This might be faster: SELECT * FROM cms_v_items WHERE itm_id = ANY(item_in_root(?));
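A sketch of what such a helper might look like. Everything here is hypothetical: item_in_root is the name suggested above, and an adjacency-list table cms_v_items(itm_id, itm_parent) is assumed; this version returns a rowset rather than an array, and RETURN QUERY requires PostgreSQL 8.3.

```sql
-- Hypothetical helper: returns the given root plus every item beneath it.
-- Assumes cms_v_items(itm_id, itm_parent); adjust names to your schema.
CREATE OR REPLACE FUNCTION item_in_root(root_id integer)
RETURNS SETOF integer AS $$
DECLARE
    r RECORD;
BEGIN
    RETURN NEXT root_id;  -- include the root itself
    FOR r IN SELECT itm_id FROM cms_v_items WHERE itm_parent = root_id LOOP
        RETURN QUERY SELECT * FROM item_in_root(r.itm_id);  -- recurse
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql STABLE;

-- Usage with a rowset instead of an array:
-- SELECT * FROM cms_v_items WHERE itm_id IN (SELECT * FROM item_in_root(?));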
Re: [GENERAL] Conditional on Select List
Is it possible to do this? SELECT IF(COUNT(colname) > 0, TRUE, FALSE) AS colname FROM table; What I want is to return a boolean, but when I tried SELECT COUNT(colname)::BOOLEAN FROM table; it says it cannot cast bigint to boolean. How about this? Logical expressions are already returned as boolean: SELECT COUNT(colname) > 0 AS colname FROM table
Re: [GENERAL] Unable to add a new column to a table named in (look like a bug ...)
Could you just have a look at the output below? I try to add a column to a table named "in" (I know in is a reserved keyword but the table exists and I cannot change it). Postgresql complains that the column already exists but it didn't. Am I doing something wrong? Try: ALTER TABLE "in" ADD COLUMN INDESCS VARCHAR[]; (note the double quotes and lower-case "in")
Re: [GENERAL] Unable to add a new column to a table named in (look like a bug ...)
It worked Thanks!! But there is definitely something wrong with the error message I got (right?): reference=# alter table "IN" add column INDESCS VARCHAR[]; ERROR: column "indescs" of relation "IN" already exists I don't know, what do you see when you \d "IN" ? When you use double-quotes, capitalization is significant, so "in" and "IN" and "In" are all different tables. This error message means you already have an upper-case "IN" table and it already has a column indescs.
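The rule at work, as a minimal sketch (table names here are illustrative): quoted identifiers preserve case, unquoted ones fold to lower case, and an unquoted reserved word is still a syntax error.

```sql
-- Quoted identifiers are case-sensitive; unquoted ones fold to lower case.
CREATE TABLE "in" (a integer);   -- a table literally named  in
CREATE TABLE "IN" (a integer);   -- a different table named  IN
SELECT * FROM "in";              -- hits the first table
SELECT * FROM "IN";              -- hits the second table
-- SELECT * FROM in;             -- error: unquoted IN is a reserved word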
Re: [GENERAL] SQL injection, php and queueing multiple statement
Is there a switch (php side or pg side) to avoid things like: pg_query("select id from table1 where a=$i"); turning into pg_query("select id from table1 where a=1 and 1=1; do something nasty; -- "); Ideally, you'd use this: pg_query_params('select id from table1 where a=$1', array($i)); http://us2.php.net/manual/en/function.pg-query-params.php Alternately, you can do this: $i = pg_escape_string($i); pg_query("select id from table1 where a='$i'");
Re: [GENERAL] Secure where in(a,b,c) clause.
I hope this isn't a FAQ, but does anyone have any suggestions as to how to make a query that selects using: where col in (comma delimited list) secure from an sql injection point of view? As the length of the comma delimited list is highly variable, I don't think I can use a prepared query to increase security. Prepared query, no... but you can still use parameter binding. Determine how many parameters you need, and create a query like this: where col in ($1, $2, $3, $4, $5) and then bind each of those parameters. This works well enough for small numbers of parameters. Somebody else will have to answer if there's a better way for larger quantities.
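An alternative worth knowing, assuming your driver can bind an array-typed parameter: a single placeholder holding an array sidesteps the variable-length problem entirely. Names below are illustrative.

```sql
-- One bound parameter, however many values the list holds.
-- $1 is bound client-side as an integer array, e.g. '{1,5,9,12}'.
SELECT id, name
FROM mytable
WHERE id = ANY ($1::int[]);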
Re: [GENERAL] Serial Data Type
I have just created a table using SELECT INTO, however the PK was supposed to be a serial. It is now an integer. To make it a serial I just create the seq and set the default to be the nextval() of that sequence, right? Is there anything else I need to do? You'll want to do this: ALTER SEQUENCE table_col_id_seq OWNED BY table.col_id; http://www.postgresql.org/docs/8.3/interactive/sql-altersequence.html It'll maintain the transactional safety of a serial-created default, right? I.e., it'll not roll back sequence values on a transaction abort, will it? Yes
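Putting the whole conversion together, as a sketch with assumed names (mytable, col_id): the setval() step starts the sequence past the keys SELECT INTO already copied, so the first nextval() doesn't collide (it assumes the table is non-empty).

```sql
-- Assumed names: mytable.col_id is the integer PK created by SELECT INTO.
CREATE SEQUENCE mytable_col_id_seq;
SELECT setval('mytable_col_id_seq', (SELECT max(col_id) FROM mytable));
ALTER TABLE mytable
    ALTER COLUMN col_id SET DEFAULT nextval('mytable_col_id_seq');
ALTER SEQUENCE mytable_col_id_seq OWNED BY mytable.col_id;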
Re: [GENERAL] Update Join ?
Does Postgres allow updates based on the context of a sub-query, something like the sample below? Yes: Update real_tab set data_desc = temp_tab.data_desc From temp_tab Where real_tab.keyID = temp_tab.keyID (note that the SET column isn't table-qualified, and don't repeat your updated table in the from list unless you mean to self-join) http://www.postgresql.org/docs/8.3/interactive/sql-update.html
Re: [GENERAL] Need help on how to backup a table
Hi all, I am trying to backup a large table with about 6 million rows. I want to export the data from the table and be able to import it into another table on a different database server (from pgsql 8.1 to 8.2). I need to export the data through a SQL query 'cause I want to do a gradual backup. Does pgsql have a facility for this? Thanks in advance for your reply. Here's an easy solution: psql -c "COPY command here" dbname | ssh [EMAIL PROTECTED] "dd of=/path/tbl.backup" You can run this from your local server, and immediately pipe it over a secure shell to the other server, and write it to a file there. The /path/ you specify is local to the remote server.
Re: [GENERAL] SQL question
I have a table that has 3 date columns: create table xyz ( xyz_id integer, date1 timestamp, date2 timestamp, date3 timestamp ) I want to select in a query the xyz_id and the max date column for each row, something like: create table temp2 as select xyz_id (max date?) where ... Is this what you want? Select xyz_id, greatest(date1,date2,date3) from xyz where... http://www.postgresql.org/docs/8.3/interactive/functions-conditional.html#AEN14508
Re: [GENERAL] Survey: renaming/removing script binaries (createdb, createuser...)
Oh, then there should have been some options in the survey along the lines of things are fine how they are. Oh, a bit of answer-forcing wasn't beneath him. Ummm... Isn't that what Option A is about? 1) What type of names do you prefer? --- a) old notation - createdb, createuser ... b) new one with pg_ prefix - pg_createdb, pg_createuser ... c) new one with pg prefix - pgcreatedb, pgcreateuser ... d) remove them - psql is the solution e) remove them - pgadmin is the solution
Re: [GENERAL] Survey: renaming/removing script binaries (createdb, createuser...)
Please let us know your meaning, thanks Zdenek Kotala 1. c 2. a 3. other = pginitdb, to be consistent with pgcreatedb, etc 4. a
Re: [GENERAL] casting from integer to boolean
Thanks Richard. Is there a way to do it without changing the INSERT command? As I mentioned, there are many more columns of different types, so finding and replacing the VALUES would be very difficult. Can you import the data into a holding table (with columns defined as integer) first, and then use a SQL statement to insert from there into the final destination table (casting in the process)?
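A sketch of that two-step import, with assumed names (holding, final_table, flag); PostgreSQL will cast integer to boolean with an explicit cast (0 = false, non-zero = true), so only the final INSERT changes, not the incoming data.

```sql
-- Hypothetical holding table mirroring the final table, flag as integer:
CREATE TEMPORARY TABLE holding (id integer, flag integer, note text);
-- load it with your existing unmodified INSERT statements, then:
INSERT INTO final_table (id, flag, note)
SELECT id, flag::boolean, note FROM holding;   -- the cast happens here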
Re: [GENERAL] How to sort strings containing a dot?
By the way, I have just inserted a duplicate. Then I have run the select statement with distinct and I got an error. How one can solve this? Does this work? select distinct name from ( select name from t order by replace(name, '.', 'a')) as t2
Re: [GENERAL] MySQL to Postgres question
The short answer is that Adam's statement is wrong, or at least misleading. Sorry Tom, I wasn't trying to do either. Joshua Drake (who I understand to be a reliable source of postgresql information) said that applying a sequence to a column after creation created issues, versus using the serial type which did not. That seemed misleading to me, since it's *exactly* what pg_dump does in 8.3. All I did was point that out, which I'd hardly call complaining and definitely not wrong. My point was that there was nothing special about serial in 8.3 Nothing misleading about that either.
Re: [GENERAL] MySQL to Postgres question
I am not sure about 8.3 but certainly earlier releases of PostgreSQL would have specific dependency issues when a sequence was applied to a column after the fact, versus using the serial or bigserial pseudo-types. I'd like to point out that using pg_dump does in fact apply sequences to columns after the fact (at least in 8.3). Columns lose their serial designation after each backup/restore (and therefore during version upgrades) mydb=# create table foo(id serial, bar varchar); NOTICE: CREATE TABLE will create implicit sequence foo_id_seq for serial column foo.id CREATE TABLE Then, pg_dump produces: -bash-3.00$ pg_dump -s --table=foo mydb CREATE TABLE foo ( id integer NOT NULL, bar character varying ); CREATE SEQUENCE foo_id_seq START WITH 1 INCREMENT BY 1 NO MAXVALUE NO MINVALUE CACHE 1; ALTER SEQUENCE foo_id_seq OWNED BY foo.id; ALTER TABLE foo ALTER COLUMN id SET DEFAULT nextval('foo_id_seq'::regclass);
Re: [GENERAL] 8.3.0 upgrade
Pick your OS/Arch from this list, and click to it: http://yum.pgsqlrpms.org/rpmchart.php Then click to C at the top, and download the compat package. Devrim, I clicked on my OS (RHEL/CentOS 4 - x86) Then on C as you said, But the RPM list still only contains the compat-postgresql-libs-3-2 package, when I'm looking for compat-postgresql-libs-4-2 Any other ideas?
Re: [GENERAL] 8.3.0 upgrade
No, you need compat-3, not compat-4. For example: [EMAIL PROTECTED] ~]# yum install php-pgsql snip -- Processing Dependency: libpq.so.3 for package: php-pgsql I have applications that depend on libpq.so.4 Where do I get that, if not compat-postgresql-libs-4-2 ??
[GENERAL] 8.3.0 upgrade
I just finished upgrading my production DB to 8.3.0. Everything went smoothly, but I thought of a few questions. After the upgrade, while restoring my backup to the new version, I got this error message: ERROR: role postgres already exists I assume this is nothing to be concerned about. But is there something I could have done to avoid this error? (I think I followed the upgrade instructions to the letter). Is there any scenario where the postgres role wouldn't exist? (should pg_dumpall exclude it?) Moving on... In step 6 of the upgrade instructions, it says: Restore your previous pg_hba.conf and any postgresql.conf modifications. Perhaps this should also mention pg_ident.conf since I restored the two mentioned files, but still couldn't connect. The third completely escaped my mind until I ran a diff on the old and new data directories. Next, one of my apps failed because of a dependency on libpq.so.4. During previous upgrades, I remedied that by installing this package: compat-postgresql-libs-4-2PGDG.rhel4 But it seems under the 8.3.0 binary downloads, this package is no longer available. The only compat package is compat-postgresql-libs-3 which of course includes only libpq.so.3 so I had to browse older releases to find the missing version, which I thought seemed a little odd. Am I missing something? Finally, regarding the new HOT feature. The release notes say that benefits are realized if no changes are made to indexed columns. If my updates include *all columns* (the SQL is generated dynamically) but the new value matches the old value for all *indexed* columns, do I still reap the benefits of HOT? Thanks! Adam
Re: [GENERAL] 8.3.0 upgrade
Thanks to Pavan for the answer regarding HOT. Does anybody have an answer regarding the postgres role or the compat lib? Adam
Re: [GENERAL] Column Statistics - How to dertermine for whole database
Is there a query to pg_catalog tables to find out which table/column has the stat level not at default in 1 sweep? Try this: select c.relname, a.attname, a.attstattarget from pg_catalog.pg_attribute a, pg_catalog.pg_class c, pg_catalog.pg_namespace n where a.attrelid = c.oid and c.relnamespace = n.oid and n.nspname = 'public' and a.attnum > 0 The value -1 means to use the default (set in postgresql.conf)
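To restrict the listing to only the non-default entries the question asks about, one more predicate should do it. A sketch (the attisdropped filter keeps dropped columns out of the result; -1 marks "use the default"):

```sql
SELECT c.relname, a.attname, a.attstattarget
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class c     ON a.attrelid = c.oid
JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
WHERE n.nspname = 'public'
  AND a.attnum > 0
  AND NOT a.attisdropped
  AND a.attstattarget <> -1;   -- only columns overriding the default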
Re: [GENERAL] How to convert postgres timestamp to date: yyyy-mm-dd
I need to convert a postgres timestamp to date format yyyy-mm-dd in a sql statement. pt.created_date below is timestamp format i.e ... WHERE pt.created_date = '2008-01-21' Any help would be greatly appreciated. Try this: WHERE pt.created_date::date = '2008-01-21'
Re: [GENERAL] Efficiently storing a directed graph
I'm not married to using SQL: are there other efficient solutions to store directed graphs? Could I hack something up in Perl or Ruby and then serialize my in-memory graph to a file (for efficient saving/reloading)? As far as a perl solution, I would suggest posting your problem on perlmonks.org ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Confused about CASE
I wanted to use the following statement to translate the relkind column to a more descriptive value: select c.relname, case when c.relkind in ('t','r') then 'table' when c.relkind = 'i' then 'index' when c.relkind = 'S' then 'sequence' when c.relkind = 'v' then 'view' else c.relkind end as mykind from pg_class c; The idea is that for anything else than 't', 'r', 'i', 'S' or 'v' it should simply return the value of relkind. In the other cases I want my value. But for some reason this returns the value of relkind for all rows. When I remove the else c.relkind part, it works as expected. I agree, this seems confusing. I found a section of the doc that caught my eye: The data types of all the result expressions must be convertible to a single output type. Which led me to try this, which works: select c.relname, case when c.relkind in ('t','r') then 'table' when c.relkind = 'i' then 'index' when c.relkind = 'S' then 'sequence' when c.relkind = 'v' then 'view' else c.relkind::text end as mykind from pg_class c
Re: [GENERAL] Confused about CASE
The data types of all the result expressions must be convertible to a single output type. The type of the field pg_class.relkind appears to be "char", which is described in the notes as: The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs as a poor-man's enumeration type. http://www.postgresql.org/docs/8.3/interactive/datatype-character.html But one would expect "char" to be convertible to text for the purposes of CASE. Both implicit and explicit casts to text seem to work fine.
Re: [GENERAL] no-arg cluster and locks ...
Can you provide more details? pg_locks, pg_stat_activity, the deadlock message? (Hmm, it would be helpful if the deadlock checker were to save the pg_locks contents and perhaps pg_stat_activity in a file, whenever a deadlock is detected.) Great idea! As somebody who's spent hours tracking down deadlocks recently, I'd love to have a configurable deadlocks.log file capability.
Re: [GENERAL] the feasibility of sending email from stored procedure in Postgres
I would instead queue messages (or suitable information about them) in a table, and have a process outside PostgreSQL periodically poll for them Why poll when you can wait? http://www.postgresql.org/docs/8.2/interactive/sql-notify.html
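A minimal sketch of the wait-instead-of-poll pattern, with hypothetical names throughout: the external mailer LISTENs on a channel, and a trigger fires NOTIFY whenever a message is queued. (NOTIFY carries no payload in 8.2, so the worker re-reads the queue table when it wakes up.)

```sql
-- Queue table and trigger names are hypothetical.
CREATE TABLE email_queue (id serial PRIMARY KEY, recipient text, body text);

CREATE OR REPLACE FUNCTION email_queue_notify() RETURNS trigger AS $$
BEGIN
    NOTIFY email_ready;   -- wake any listening mailer process
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER email_queue_notify AFTER INSERT ON email_queue
    FOR EACH ROW EXECUTE PROCEDURE email_queue_notify();

-- The external mailer connects, runs:  LISTEN email_ready;
-- then blocks until a notification arrives and SELECTs the new rows.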
Re: [GENERAL] Storing images as BYTEA or large objects
I have two options for storing this data: As BYTEA or as large objects. Is it true that if you update a row containing a large BYTEA value, (even if you're not updating the BYTEA field itself, just another field), it requires the entire BYTEA value to be copied to a new row (because of MVCC)? Or is this not true because of TOAST? If true, would this have an impact on the buffer cache and/or checkpoints? (You could always separate out the BYTEA values to their own table by themselves to avoid this drawback)
Re: [GENERAL] help optimizing query
It seems to do the job, but how good is it in the long run? Any way I could tweak it? I think this form will work the best: SELECT u.login, MAX(s.stop_time) AS last_use_time FROM users u, stats s WHERE u.id=s.user_id AND u.status='3' AND u.next_plan_id IS NULL GROUP BY u.login HAVING MAX(s.stop_time) > (now() - interval '1 month') ORDER BY last_use_time;
[GENERAL] Subquery Factoring ?
Are there any plans to support materialized subquery / factoring (sql-99 WITH) in Postgresql? I am spoiled with this feature in oracle, and find myself wishing I had it in postgresql more and more. It *seems* to an outsider like a relatively easy addition. I searched the archives but only found a brief mention of a syntax-support-only patch from last winter. Adam
Re: [GENERAL] Subquery Factoring ?
Gregory, Are you using it just to avoid retyping a complex subquery? Or do you expect that the feature will reduce the execution time by avoiding re-executing the subquery for each call site in the query? The only situation where I rely on this currently is when my main/outer query references or executes an expensive subquery multiple times. The main goal is to speed the whole thing up by only executing the subquery once, but making the query shorter and easier to read is a nice side effect. How disappointing would it be if the WITH clause acted as an optimization barrier, preventing WHERE clauses from being pushed down and potentially using indexes? Do you mean that WHERE clauses in the main/outer query are not used to optimize the subquery? This would be a great feature down the road, but I would be quite happy without it. Or if the query ended up not needing the data in the WITH subquery but the query had to execute it anyways? Hmmm... I would expect the query be executed once, regardless. Indeed, I would think that executing it more than once would produce strange results. I'm counting on all references to the subquery to contain the same data. Another way of looking at this question is: if you called some volatile function from the subquery such as one which printed diagnostic messages or accessed some remote service, how many times would you expect it to be called? Would you expect the feature to guarantee that the function would only be called once or would it be ok if it were called 0 times if the subquery data was never needed or many times if the optimizer thought that would be faster? I would expect it to be called exactly once. I think that logically, I expect WITH to work as shorthand for creating a temporary table, filling it with my subquery, and then executing the rest of the main/outer query. (As opposed to creating a temporary view) Adam
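For readers unfamiliar with the feature being requested, SQL-99 subquery factoring looks roughly like this (the table and column names are purely illustrative): the expensive subquery is named once and referenced multiple times in the outer query.

```sql
-- Illustrative only: dept_costs is computed once, used twice.
WITH dept_costs AS (
    SELECT dept_id, SUM(salary) AS total
    FROM employees
    GROUP BY dept_id
)
SELECT d.dept_id, d.total
FROM dept_costs d
WHERE d.total > (SELECT AVG(total) FROM dept_costs);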
Re: [GENERAL] Oracle Analytical Functions
I'm trying to replicate the use of Oracle's 'lag' and 'over partition by' analytical functions in my query. I have a table (all_client_times) such as: and I would like to create a new view that takes the first table and calculates the time difference in minutes between each row so that the result is something like: I thought of another way of doing this. In my tests, it's a little faster, too. DROP SEQUENCE if exists seq1; DROP SEQUENCE if exists seq2; CREATE TEMPORARY SEQUENCE seq1 CACHE 1000; CREATE TEMPORARY SEQUENCE seq2 CACHE 1000; select a.client_id, b.datetime, a.datetime as previousTime, (b.datetime - a.datetime) as difftime from (select nextval('seq1') as s, client_id, datetime from all_client_times order by client_id, datetime OFFSET 0) as a inner join (select nextval('seq2') as s, client_id, datetime from all_client_times order by client_id, datetime OFFSET 0) as b on a.s=(b.s-1) where a.client_id=b.client_id
Re: [GENERAL] Oracle Analytical Functions
Hi Willem, for some reason the order by's aren't working. Could you provide more details? Do you get a specific error message? only returning 658 rows instead of the 750K. You should not expect the same row count in both source table and result set. Even in your example -- you provided 8 source rows, and 4 result rows. The correct number of results is, for each client_id with two or more records in all_client_times, its record count minus one. It may be true that you have 750k records but only 658 rows that satisfy this requirement. What do you get for this query? select count(*) from ( select client_id, count(*) as rows from all_client_times group by client_id having count(*) > 1 ) as x Adam
Re: [GENERAL] Oracle Analytical Functions
Ah, ok. I see what's happening. The data is retrieved from the tables, and the sequence values are added, PRIOR to the order by, so that after the order by they are no longer sorted. (The same thing can happen in Oracle with ROWNUM.) You can go the sorted-view route, or just use an inline view, like this:

select a.client_id, b.arbnum, a.arbnum as previousarbnum,
       (b.arbnum - a.arbnum) as diffarbnum
from (select nextval('seq1') as s, *
      from (select client_id, arbnum from arb_test
            order by client_id, arbnum OFFSET 0) as y
      OFFSET 0) as a
inner join
     (select nextval('seq2') as s, *
      from (select client_id, arbnum from arb_test
            order by client_id, arbnum OFFSET 0) as z
      OFFSET 0) as b
on a.s = (b.s - 1)
where a.client_id = b.client_id;

-Original Message- From: Willem Buitendyk [mailto:[EMAIL PROTECTED] Sent: Thursday, January 31, 2008 2:48 PM To: Adam Rich Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Oracle Analytical Functions

Here is a little test example. It seems that the second order by condition is not working - in this case datetime.
create table arb_test (client_id integer, arbnum integer);
insert into arb_test values (2,1);
insert into arb_test values (2,33);
insert into arb_test values (2,6);
insert into arb_test values (2,76);
insert into arb_test values (2,111);
insert into arb_test values (2,10);
insert into arb_test values (2,55);
insert into arb_test values (7,12);
insert into arb_test values (7,6);
insert into arb_test values (7,144);
insert into arb_test values (7,63);
insert into arb_test values (7,87);
insert into arb_test values (7,24);
insert into arb_test values (7,22);
insert into arb_test values (1,14);
insert into arb_test values (1,23);
insert into arb_test values (1,67);
insert into arb_test values (1,90);
insert into arb_test values (1,2);
insert into arb_test values (1,5);
insert into arb_test values (5,8);
insert into arb_test values (5,42);
insert into arb_test values (5,77);
insert into arb_test values (5,9);
insert into arb_test values (5,89);
insert into arb_test values (5,23);
insert into arb_test values (5,11);

DROP SEQUENCE if exists seq1;
DROP SEQUENCE if exists seq2;
CREATE TEMPORARY SEQUENCE seq1 CACHE 1000;
CREATE TEMPORARY SEQUENCE seq2 CACHE 1000;

select a.client_id, b.arbnum, a.arbnum as previousarbnum,
       (b.arbnum - a.arbnum) as diffarbnum
from (select nextval('seq1') as s, client_id, arbnum
      from arb_test order by client_id, arbnum OFFSET 0) as a
inner join
     (select nextval('seq2') as s, client_id, arbnum
      from arb_test order by client_id, arbnum OFFSET 0) as b
on a.s = (b.s - 1)
where a.client_id = b.client_id;

--create or replace view arb_view as select * from arb_test order by client_id, arbnum;

Here are the results:

 client_id | arbnum | previousarbnum | diffarbnum
-----------+--------+----------------+------------
         1 |     23 |             14 |          9
         1 |     67 |             23 |         44
         1 |     90 |             67 |         23
         1 |      2 |             90 |        -88
         1 |      5 |              2 |          3
         2 |     33 |              1 |         32
         2 |      6 |             33 |        -27
         2 |     76 |              6 |         70
         2 |    111 |             76 |         35
         2 |     10 |            111 |       -101
         2 |     55 |             10 |         45
         5 |     42 |              8 |         34
         5 |     77 |             42 |         35
         5 |      9 |             77 |        -68
         5 |     89 |              9 |         80
         5 |     23 |             89 |        -66
         5 |     11 |             23 |        -12
         7 |      6 |             12 |         -6
         7 |    144 |              6 |        138
         7 |     63 |            144 |        -81
         7 |     87 |             63 |         24
         7 |     24 |             87 |        -63

When I used a sorted view:

create or replace view arb_view as select * from arb_test order by client_id, arbnum;

and redid it the results are:

 client_id | arbnum | previousarbnum | diffarbnum
-----------+--------+----------------+------------
         1 |      5 |              2 |          3
         1 |     14 |              5 |          9
         1 |     23 |             14 |          9
         1 |     67 |             23 |         44
         1 |     90 |             67 |         23
         2 |      6 |              1 |          5
         2 |     10 |              6 |          4
         2 |     33 |             10 |         23
         2 |     55 |             33 |         22
         2 |     76 |             55 |         21
         2 |    111 |             76 |         35
         5 |      9 |              8 |          1
         5 |     11 |              9 |          2
         5 |     23 |             11 |         12
         5
Re: [GENERAL] Dump schema without the functions
how can I dump a schema with all tables, but without the functions? Is there a way to do it, or do I have to manually drop the functions later after having used pg_restore? Stef, you can edit the data between dump and restore, to comment out the function references. Or, you can use the -L argument with pg_restore to provide a list of the specific items you want to restore. For example:

pg_dump -Fc mydb > db.dump
pg_restore -l db.dump | grep -v FUNCTION > db.nofunc.dump
pg_restore -L db.nofunc.dump -d newdb db.dump

(This assumes the word FUNCTION doesn't appear elsewhere in your schema object names. If it does, you might try tightening the pattern, such as grep -v "FUNCTION public".) Adam
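The filtering step can be checked without a database: pg_restore -l emits one line per archive item, and any line naming a FUNCTION is dropped from the list passed back via -L. A sketch of the grep -v step in Python, with a made-up two-table, one-function listing (the TOC lines below are illustrative, not real pg_restore output):

```python
# Simulate `grep -v FUNCTION` on a pg_restore -l style listing.
# The listing content is invented for illustration.
toc = """\
123; 1259 16384 TABLE public mytable postgres
124; 1255 16385 FUNCTION public myfunc() postgres
125; 1259 16386 TABLE public othertable postgres
"""

kept = [line for line in toc.splitlines() if "FUNCTION" not in line]
# `kept` is what would be written to the -L list file: tables only.
```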
Re: [GENERAL] Oracle Analytical Functions
and I would like to create a new view that takes the first table and calculates the time difference in minutes between each row so that the result is something like:

client_id, datetime, previousTime, difftime
122, 2007-05-01 12:01:00, 2007-05-01 12:00:00, 1
455, 2007-05-01 12:03:00, 2007-05-01 12:02:00, 1
455, 2007-05-01 12:08:00, 2007-05-01 12:03:00, 5
299, 2007-05-01 12:34:00, 2007-05-01 12:10:00, 24

Any idea how I could replicate this in SQL with PG? Would this be an easy thing to do in PL/pgSQL? If so, could anyone give any directions as to where to start? You can create a set-returning function that cursors over the table, like this:

CREATE OR REPLACE FUNCTION lagfunc(
    OUT client_id INT,
    OUT datetime timestamp,
    OUT previousTime timestamp,
    OUT difftime interval)
RETURNS SETOF RECORD AS $$
DECLARE
    thisrow RECORD;
    last_client_id INT;
    last_datetime timestamp;
BEGIN
    FOR thisrow IN SELECT * FROM all_client_times
                   ORDER BY client_id, datetime LOOP
        IF thisrow.client_id = last_client_id THEN
            client_id := thisrow.datetime;
            datetime := thisrow.datetime;
            previousTime := last_datetime;
            difftime := datetime - previousTime;
            RETURN NEXT;
        END IF;
        last_client_id := thisrow.client_id;
        last_datetime := thisrow.datetime;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

select * from lagfunc() limit 10;
select * from lagfunc() where client_id = 455;

Here I used an interval, but you get the idea.
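The "remember the previous row" logic of that loop is easy to mirror outside the database: sort by (client_id, datetime) and emit a pair only when the previous row belongs to the same client. A pure-Python sketch of the same idea, with invented sample data:

```python
# Pure-Python sketch of the cursor-loop lag: sort, then pair each row
# with its predecessor when both share a client_id. Data is invented.
rows = [
    (455, "2007-05-01 12:03:00"),
    (122, "2007-05-01 12:00:00"),
    (122, "2007-05-01 12:01:00"),
    (455, "2007-05-01 12:02:00"),
]

out = []
last_client = last_dt = None
for client_id, dt in sorted(rows):
    if client_id == last_client:
        out.append((client_id, dt, last_dt))  # (client, time, previousTime)
    last_client, last_dt = client_id, dt
```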
Re: [GENERAL] Oracle Analytical Functions
I tried this function but it keeps returning an error such as:

ERROR: invalid input syntax for integer: 2007-05-05 00:34:08
SQL state: 22P02
Context: PL/pgSQL function lagfunc line 10 at assignment

Whoops, this line: client_id := thisrow.datetime; Should be: client_id := thisrow.client_id;
Re: [GENERAL] Getting the count(*) from two tables and two date ranges in same query
Resulting in 4 columns in the ResultSet like:

count(*)_from_table2_between_fromdate1_and_todate1 = X
count(*)_from_table2_between_fromdate2_and_todate2 = Y
count(*)_from_table3_between_fromdate1_and_todate1 = Z
count(*)_from_table3_between_fromdate2_and_todate2 = V

Is this possible?

Select t1.id,
       sum(case when t2.date between d1 and d2 then 1 else 0 end) as sum1,
       sum(case when t3.date between d1 and d2 then 1 else 0 end) as sum2
from t1, t2, t3
where t1.id = t2.id and t2.id = t3.id
group by t1.id
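The sum(case ...) pattern extends naturally to all four requested columns: one sum per table/range combination. A sketch of the conditional-aggregation idea against a single table with two date ranges, run through Python's sqlite3 for illustration (table, column, and range values are invented):

```python
# Conditional aggregation: one sum(case ...) per date range.
# sqlite3 stands in for PostgreSQL here; the pattern is the same.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t2 (id integer, d text);
    insert into t2 values
        (1, '2008-01-05'), (1, '2008-02-05'), (2, '2008-01-10');
""")
rows = conn.execute("""
    select id,
           sum(case when d between '2008-01-01' and '2008-01-31'
                    then 1 else 0 end) as range1,
           sum(case when d between '2008-02-01' and '2008-02-29'
                    then 1 else 0 end) as range2
    from t2
    group by id
    order by id
""").fetchall()
```

One caution with the reply's join form: joining t1, t2, and t3 in one query multiplies rows when an id has several matches in both t2 and t3, which inflates the sums; aggregating each table separately (or in subqueries) avoids that.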
Re: [GENERAL] Simple row serialization?
I'd like to implement some simple data logging via triggers on a small number of infrequently updated tables and I'm wondering if there are some helpful functions, plugins or idioms that would serialize a row If you're familiar with perl, you can try PL/Perl. http://www.postgresql.org/docs/8.2/interactive/plperl-triggers.html Here's an example (untested). If you're using quotes and colons as delimiters, you may also need to escape those in your data.

CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
my ($old_serialized, $new_serialized);
foreach my $col (keys %{$_TD->{old}}) {
    $old_serialized .= '"' . $col . '":"' . $_TD->{old}{$col} . '",';
}
foreach my $col (keys %{$_TD->{new}}) {
    $new_serialized .= '"' . $col . '":"' . $_TD->{new}{$col} . '",';
}
my $qry = spi_prepare('insert into log_tbl values ($1,$2)', 'VARCHAR', 'VARCHAR');
spi_exec_prepared($qry, $old_serialized, $new_serialized);
spi_freeplan($qry);
return;
$$ LANGUAGE plperl;
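The escaping caveat is worth taking seriously: a column value containing a quote makes the naive `"col":"value",` output unparseable. A sketch of the same serialization format with escaping, in Python (the function name is invented; the format mirrors the trigger above):

```python
# Sketch: serialize a row dict as "col":"value", pairs, escaping
# backslashes and double quotes inside values so the log stays parseable.
def serialize_row(row):
    parts = []
    for col in sorted(row):
        val = str(row[col]).replace("\\", "\\\\").replace('"', '\\"')
        parts.append('"%s":"%s",' % (col, val))
    return "".join(parts)
```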
[GENERAL] altering foreign keys
In my database, I have a core table that nearly all other tables key against. Now I need to adjust all of those foreign keys to add a on update cascade action. Is there a way to alter the existing keys? (it didn't jump out at me in the manual) If not, is there a serious issue preventing this feature? If I have to drop and re-create all of the foreign keys, is it possible to wrap the whole operation in a transaction without risking invalid inserts in the referring tables? (I come from an Oracle background, where DDL causes an implicit commit)
Re: [GENERAL] altering foreign keys
In my database, I have a core table that nearly all other tables key against. Now I need to adjust all of those foreign keys to add a on update cascade action. Is there a way to alter the existing keys? (it didn't jump out at me in the manual) Would it be possible to modify confupdtype in pg_constraint?
[GENERAL] Locking concurrency - best practices
I have a parent_tbl and dozens of data tables with foreign keys referencing the PK of parent_tbl (one-to-many). There are 100+ users accessing the application; usually (but not always) each user is working on a different record in parent_tbl. (This would seem like a pretty standard scenario for a lot of apps.) Each user performs multiple queries in a transaction, reading and modifying the data in parent_tbl and multiple data tables before committing. I need the data to be consistent during and after the transaction. (I basically need a way to lock a row in parent_tbl, plus all rows in the data tables referencing that row, and prevent new rows from being inserted that reference that row.) To achieve this, I added FOR UPDATE to queries against parent_tbl and LOCK TABLE IN EXCLUSIVE MODE before queries against all of the data tables. This works, except it slows down the entire application because all transactions are serialized. Even users who are working on separate records in parent_tbl are not allowed to proceed simultaneously. This is not ideal; the vast majority of access to this database is users working on separate records. Should I drop the LOCK TABLE statements completely? As long as *every* part of the application that modifies data obtains a FOR UPDATE lock on the parent table's record first, there shouldn't be any concurrency issues. But I realize I'm really only implementing advisory locking, and there's nothing preventing data corruption from any application that forgets or leaves out the FOR UPDATE. Is this the best practice for dealing with this situation? Should I be using real advisory locks instead of FOR UPDATE? What are the pros and cons of each?
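The trade-off described above (per-record FOR UPDATE vs. a global LOCK TABLE) can be pictured with ordinary in-process locks: one lock per parent record lets users of different records proceed concurrently, while a single shared lock serializes everyone. A toy sketch, not PostgreSQL code; the record ids and structure are invented:

```python
# Toy analogy: one lock per parent_tbl id (like SELECT ... FOR UPDATE on
# that row). A single shared lock would play the role of LOCK TABLE and
# serialize all four workers instead of only same-record pairs.
import threading

record_locks = {1: threading.Lock(), 2: threading.Lock()}
log = []
log_guard = threading.Lock()

def work_on(record_id):
    with record_locks[record_id]:   # blocks only users of the SAME record
        with log_guard:
            log.append(record_id)

threads = [threading.Thread(target=work_on, args=(rid,))
           for rid in (1, 2, 1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```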