Re: [SQL] Tweak sql result set... ?

2009-07-30 Thread Jasmin Dizdarevic
I think Jasen is thinking of manipulating the result set in your programming
environment, not in the database.
Btw, from the point of view of "clean programming" it's a bad idea to integrate
HTML elements directly into your database queries. You're mixing the data layer
and the design layer.

What do you mean by "sometimes"?
2009/7/29 Axe 

>
> > > Any ideas on how to achieve this? I would like to let the original sql
> > > code stay original. I can prepare postgres before executing the sql if
> > > this makes it easier to achieve the goal
> >
> > Have a look at CREATE RULE.
> >
> > Tim
> >
> Ok, you mean I could create a rule for the table, then I let the
> script go into my "black box",
> do the original selects, but get a manipulated result set back, then I
> drop the rule so
> that the blackbox of scripts can get the not manipulated result set
> back?
>
> I need to sometimes get the result "output from query" and
> sometimes not
> and it would be awesome to get it with the same query, but maybe by
> setting
> a rule or similar. Best from a performance point of view would be to make
> something
> more permanent. Is it possible to activate/deactivate a rule?
>
> Axe
>



-- 
Best regards

Dizdarevic Jasmin
Sonnenbergstr. 3
6714 Nüziders, AUT

[email protected]
+43 664 411 79 29


[SQL] SELECT max(time) group by problem

2009-07-30 Thread Heigo Niilop
hi,

I have a table:

CREATE TABLE table
(
  id integer NOT NULL,
  timest timestamp with time zone NOT NULL,
  db_time timestamp with time zone NOT NULL DEFAULT now(),
  "values" text[],
  CONSTRAINT table_pkey PRIMARY KEY (id, timest)
)

"id" has a foreign key referencing table1,

and when I try to do

SELECT MAX(table.timest) FROM table, table1 WHERE
table.id=table1.id and
table1.id in (1,2,3) GROUP BY table.id

then it is terribly slow. When I use this strange syntax

SELECT table.timest FROM table,table1 WHERE
table.id=table1.id and table1.id in(1,2,3) and table.timest=
(SELECT max(timest) FROM table WHERE table.id=table1.id)

I receive all the needed data very fast.

My questions are:

1)  why is the first query slow, and what can I do to make it faster
(some more indexes?)?

2)  what kind of danger do I have with the second query (so far I have
gotten the right data)?

I have Postgres 8.3 and the table has over a million rows.

Regards,

Heigo


Re: [SQL] Tweak sql result set... ?

2009-07-30 Thread Tim Landscheidt
(anonymous) wrote:

>> > Any ideas on how to achieve this? I would like to let the original sql
>> > code stay original. I can prepare postgres before executing the sql if
>> > this makes it easier to achieve the goal

>> Have a look at CREATE RULE.

> Ok, you mean I could create a rule for the table, then I let the
> script go into my "black box",
> do the original selects, but get a manipulated result set back, then I
> drop the rule so
> that the blackbox of scripts can get the not manipulated result set
> back?

No, I meant that you should have a look at CREATE RULE. From
a design perspective, I'd probably rename the old table and
put an updatable view in its place.
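
A minimal sketch of that view approach (the table and column names here are
made up for illustration, they are not from this thread; the label column is
assumed to be of type text so both view definitions expose the same column):

-- hypothetical names: items, items_base, id, label
ALTER TABLE items RENAME TO items_base;

CREATE VIEW items AS
SELECT id,
       '<b>' || label || '</b>' AS label   -- decorated output
FROM items_base;

-- switch back to the plain result set by redefining the view
CREATE OR REPLACE VIEW items AS
SELECT id, label FROM items_base;

The unchanged scripts keep selecting from "items"; which definition is
currently active acts as the on/off switch asked about below.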

> I need to sometimes get the result "output from query" and
> sometimes not
> and it would be awesome to get it with the same query, but maybe by
> setting
> a rule or similar. Best from a performance point of view would be to make
> something
> more permanent. Is it possible to activate/deactivate a rule?

Of course, DROP RULE.

Tim




Re: [SQL] SQL report

2009-07-30 Thread Rob Sargent
I would be curious to know the performance curve for, let's say, 20K, 40K,
60K, 80K, 100K records.  And what sort of indexing you have, whether
or not it's clustered, re-built and so on.


One could envision partitioning the status table such that recent 
records were grouped together (on the assumption that they will be most 
frequently "reported").
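
A hedged sketch of that kind of partitioning, in the 8.3-style inheritance
form (the table and column names follow the StatusRecord description quoted
below; the child table names and the datetime boundary value are made up):

CREATE TABLE statusrecord_recent (
    CHECK (datetime >= 1233000000000)
) INHERITS (statusrecord);

CREATE TABLE statusrecord_history (
    CHECK (datetime < 1233000000000)
) INHERITS (statusrecord);

CREATE INDEX statusrecord_recent_obj_dt
    ON statusrecord_recent (objectId, datetime);

-- with constraint_exclusion = on, a query bounded on datetime only
-- touches the partition whose CHECK constraint can match
SET constraint_exclusion = on;
SELECT * FROM statusrecord WHERE datetime >= 1233897527657;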


[email protected] wrote:

I have the following scenario.

I have a tracking system. The system records the status of an 
object regularly; all the status records are stored in one table, and 
it keeps a history of at most 1000 status records for each object 
it tracks. The maximum number of objects the system will track is 100,000, 
which means I will potentially have a table size of 100 million records.


I have to generate a report on the latest status of all objects being 
tracked at a particular point in time, and I also have to allow users 
to sort and filter on different columns in the status records displayed 
in the report.


The following is a brief description of the record types (they are 
not actual code):


ObjectRecord(
objectId bigint PrimaryKey
desc varchar
)

StatusRecord (
id bigint PrimaryKey
objectId bigint indexed
datetime bigint indexed
capacity double
reliability double
efficiency double
)

I have tried the following; it works very well with around 
20,000 objects (the query returns in less than 10s). But when I have 
100,000 objects it becomes very, very slow (I don't even have the patience 
to wait for it to return; I killed it after 30 mins).


select * from statusrecord s1
INNER JOIN ( SELECT objectId, MAX(datetime) AS msdt
             FROM statusrecord
             WHERE startDatetime <= 1233897527657
             GROUP BY objectId ) AS s2
        ON ( s1.objectId = s2.objectId AND s1.datetime = s2.msdt )
where ( capacity < 10.0 )
order by s1.datetime DESC, s1.objectId DESC;


I did try to write a stored procedure like the one below; for 100,000 objects 
and 1000 status records per object, it returns in around 30 mins.


CREATE OR REPLACE FUNCTION getStatus(pitvalue BIGINT) RETURNS SETOF
statusrecord AS $BODY$
DECLARE
    object objectRecord%ROWTYPE;
    status statusrecord%ROWTYPE;
BEGIN
    FOR object IN SELECT * FROM objectRecord
    LOOP
        -- latest status row for this object at the given point in time
        EXECUTE 'SELECT * FROM statusrecord WHERE objectId = '
                || quote_literal(object.objectId)
                || ' AND datetime <= ' || quote_literal(pitvalue)
                || ' ORDER BY datetime DESC LIMIT 1'
        INTO status;
        -- EXECUTE ... INTO leaves the target NULL when no row matches
        -- (and does not set FOUND), so test the primary key instead
        IF status.id IS NOT NULL THEN
            RETURN NEXT status;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE plpgsql;

Just want to know if anyone has a different approach to my scenario. 
Thanks a lot.


John 




Re: [SQL] Tweak sql result set... ?

2009-07-30 Thread Rob Sargent
I agree.  All clients issue the same sql and deal with it as they will.  
The psql client for example can format the results in various ways (pset 
variations etc).  Your client(s) need(s) to interpret their identical 
results differently.  Doesn't seem to me to be the job of SQL?
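
For instance, in psql the presentation is purely a client-side setting (a
generic sketch, not tied to the original poster's schema; any query works):

\pset format aligned
SELECT 1 AS one;        -- plain text output

\pset format html
SELECT 1 AS one;        -- same query, rendered as an HTML table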


Jasmin Dizdarevic wrote:
I think Jasen is thinking of manipulating the result set in your 
programming environment, not in the database.
Btw, from the point of view of "clean programming" it's a bad idea to integrate 
HTML elements directly into your database queries. You're mixing the data 
layer and the design layer.

What do you mean by "sometimes"?

2009/7/29 Axe <[email protected]>


> > Any ideas on how to achieve this? I would like to let the
> > original sql code stay original. I can prepare postgres before
> > executing the sql if this makes it easier to achieve the goal
>
> Have a look at CREATE RULE.
>
> Tim
>
Ok, you mean I could create a rule for the table, then I let the
script go into my "black box",
do the original selects, but get a manipulated result set back, then I
drop the rule so
that the blackbox of scripts can get the not manipulated result set
back?

I need to sometimes get the result "output from query" and
sometimes not
and it would be awesome to get it with the same query, but maybe by
setting
a rule or similar. Best from a performance point of view would be to make
something
more permanent. Is it possible to activate/deactivate a rule?

Axe





--
Best regards

Dizdarevic Jasmin
Sonnenbergstr. 3
6714 Nüziders, AUT

[email protected] 
+43 664 411 79 29





Re: [SQL] Foreign keys and fixed values...

2009-07-30 Thread Daryl Richter

Hi-

On Jul 29, 2009, at 4:16 PM, Kjell Rune Skaaraas wrote:



Hello,

I'm having a problem using foreign keys, or at least using them in a way I  
find good. Basically I have a table I wish to use as a foreign key lookup,  
with "id" and "code" columns, like:


1,a
1,b
1,c
2,a
2,d

I have, of course, a UNIQUE constraint on (id, code). However, in the table I  
want the key to apply to, I only have one column that I want to  
restrict to one lookup id. Basically I want to do something like:


ALTER TABLE data ADD FOREIGN KEY ( 2, col ) REFERENCES lookups ( id,  
code )


In this case 2 is a literal integer, not a column reference, and the  
result should be that a and d are legal - but this command isn't valid.


Options:
1) Add another column to my data table like:
ALTER TABLE data ADD COLUMN dummy INTEGER DEFAULT 2
ALTER TABLE data ADD FOREIGN KEY ( dummy, col ) REFERENCES lookups  
( id, code )
This works, but creates a useless column in my data table that feels  
very unclean.


2) Refactor the lookup table to have one lookup per table. However,  
this also means I'll probably need to duplicate translation tables,  
value mapping tables etc. and a bunch of queries would have to  
dynamically alter table names. It seems more than a little messy.





#2 is the proper solution.

A foreign key is a binding constraint between all the rows in the 2  
tables.  You are essentially trying to create a FK that applies to  
some rows and not to others.  Or, alternatively, a table which has  
conditional FKs to other tables based upon the data it contains.


So, you have created a dependency between the data in the database and  
your schema, which is a bad idea.  Imagine, for example, after you have  
set this all up, the user comes to you and says, "Sorry, when I told  
you 2 maps to either a or d, I should have said 3".  Now, instead of  
just issuing an update, you have to change your schema.


Your other concerns regarding this option can almost certainly be  
addressed through proper schema design too.
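
A hedged sketch of what option 2 could look like (the lookup table name and
the extra id column are invented for illustration; "data", "col" and the
codes 'a'/'d' come from the original post):

-- one lookup table per code set, so the FK is an ordinary constraint
CREATE TABLE type2_codes (
    code text PRIMARY KEY    -- holds only the codes valid here: 'a', 'd'
);

CREATE TABLE data (
    id  serial PRIMARY KEY,
    col text NOT NULL REFERENCES type2_codes (code)
);

Changing which codes are valid then means updating rows in type2_codes
rather than altering the schema.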


--
Daryl
http://itsallsemantics.com

"Everyone thinks of changing the world, but no one thinks of changing  
himself."

- Leo Tolstoy





Re: [SQL] SQL report

2009-07-30 Thread Steve Crawford

[email protected] wrote:

I have the following scenario.

I have a tracking system. The system records the status of an 
object regularly; all the status records are stored in one table, and 
it keeps a history of at most 1000 status records for each object 
it tracks. The maximum number of objects the system will track is 100,000, 
which means I will potentially have a table size of 100 million records.


I have to generate a report on the latest status of all objects being 
tracked at a particular point in time, and I also have to allow users 
to sort and filter on different columns in the status records displayed 
in the report.

...


Just want to know if anyone has a different approach to my scenario. 
Thanks a lot.


Not knowing all the details of your system, here are some things you 
could experiment with:


1. Add a "latest record id" field in your object table (automatically 
updated with a trigger) that would allow you to do a simple join with 
the tracking table. I suspect that such a join will be far faster than 
calculating "max" 100,000 times at the expense of a slightly larger main 
table.


2. Add a "current record flag" in the status table that simply flags the 
most recent record for each object (again, use triggers to keep the flag 
appropriately updated). This would also eliminate the need for the "max" 
subquery. You could even create a partial index filtering on the 
"current record flag" which could speed things up if the reporting query 
is written correctly.


3. Partition the table into a "current status table" and "historical 
status table" (each inheriting from the main table). Use a trigger so 
that anytime a new status record is added, the old "current" record is 
moved from the "current" to the "historical" table and the new one added 
to the "current" table. The latest status report will only need a simple 
join on the "current" table with a max size of 100,000 rather than a 
more complex query over a 100,000,000 record table.
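
A minimal sketch of option 3, using the usual trigger-redirect pattern for
inheritance partitioning (the child table, function and trigger names are
invented; the columns follow the StatusRecord description above):

CREATE TABLE statusrecord_current () INHERITS (statusrecord);
CREATE TABLE statusrecord_history () INHERITS (statusrecord);
-- each child table needs its own indexes, e.g. on (objectId, datetime)

CREATE OR REPLACE FUNCTION statusrecord_insert() RETURNS trigger AS $$
BEGIN
    -- retire the previous "current" row for this object
    INSERT INTO statusrecord_history
        SELECT * FROM statusrecord_current WHERE objectId = NEW.objectId;
    DELETE FROM statusrecord_current WHERE objectId = NEW.objectId;
    -- store the new row in the "current" partition
    INSERT INTO statusrecord_current VALUES (NEW.*);
    RETURN NULL;    -- nothing is stored in the parent table itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER statusrecord_insert_trg
    BEFORE INSERT ON statusrecord
    FOR EACH ROW EXECUTE PROCEDURE statusrecord_insert();

-- the latest-status report then scans at most ~100,000 rows
SELECT * FROM statusrecord_current
WHERE capacity < 10.0
ORDER BY datetime DESC, objectId DESC;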


Cheers,
Steve




[SQL] Show CAD, USD first; the rest ordered by currency name

2009-07-30 Thread Emi Lu

Good morning,

I have a currency table (code, description).

Example values:
 ADF | Andorran Franc
 ... ...
 ANG | NL Antillian Guilder
 AON | Angolan New Kwanza
 AUD | Australian Dollar
 AWG | Aruban Florin
 BBD | Barbados Dollar
 USD | US Dollar
 CAD | Canadian Dollar

Is there a way I can write a query to display USD and CAD first, while the 
other rows are ordered by code?


For example,

CAD | Canadian Dollar
USD | US Dollar
ADF | Andorran Franc
...


Thanks a lot!
--
Lu Ying





Re: [SQL] Show CAD, USD first; the rest ordered by currency name

2009-07-30 Thread Andreas Kretschmer
Emi Lu  wrote:

> Good morning,
>
> I have a currency table (code, description).
>
> Example values:
>  ADF | Andorran Franc
>  ... ...
>  ANG | NL Antillian Guilder
>  AON | Angolan New Kwanza
>  AUD | Australian Dollar
>  AWG | Aruban Florin
>  BBD | Barbados Dollar
>  USD | US Dollar
>  CAD | Canadian Dollar
>
> Is there a way I can write a query to display USD and CAD first, while the  
> other rows are ordered by code?
>
> For example,
>
> CAD | Canadian Dollar
> USD | US Dollar
> ADF | Andorran Franc
> ...
>

Sure:

test=*# select * from currency order by case when code='USD' then 0 when
code = 'CAD' then 1 end, code;
 code | description
--+--
 USD  | US Dollar
 CAD  | Canadian Dollar
 ADF  | Andorran Franc
 ANG  | NL Antillian Guilder
 AON  | Angolan New Kwanza
 AUD  | Australian Dollar
 AWG  | Aruban Florin
 BBD  | Barbados Dollar
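
The CASE above yields NULL for every code other than USD and CAD, and those
rows end up after them because NULLs sort last in ascending order. An
equivalent variant with an explicit ELSE (just a sketch, not part of the
original reply):

select * from currency
order by case code when 'USD' then 0 when 'CAD' then 1 else 2 end, code;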




Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°



Re: [SQL] Show CAD, USD first; the rest ordered by currency name

2009-07-30 Thread sergey kapustin
...order by currency not in('USD', 'AND', 'CAD');

this condition will be evaluated as FALSE for USD, AND and CAD, and as TRUE
for all other currencies. When the records are sorted, the "false" rows are
placed at the top, because false < true.
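
Written out in full against the currency table from the original post (a
sketch; the table and column names follow the earlier replies):

select code, description
from currency
order by code not in ('USD', 'CAD'), code;

Emi Lu wrote: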

> Good morning,
>
> I have a currency table (code, description).
>
> Example values:
>  ADF | Andorran Franc
>  ... ...
>  ANG | NL Antillian Guilder
>  AON | Angolan New Kwanza
>  AUD | Australian Dollar
>  AWG | Aruban Florin
>  BBD | Barbados Dollar
>  USD | US Dollar
>  CAD | Canadian Dollar
>
> Is there a way I can write a query to display USD and CAD first, while the other
> rows are ordered by code?
>
> For example,
>
> CAD | Canadian Dollar
> USD | US Dollar
> ADF | Andorran Franc
> ...
>
>
> Thanks a lot!
> --
> Lu Ying
>
>
>


Re: [SQL] SELECT max(time) group by problem

2009-07-30 Thread nha
Hello,

On 30/07/09 11:38, Heigo Niilop wrote:
> hi,
> 
> I have table
> 
> CREATE TABLE table
> (
>   id integer NOT NULL,
>   timest timestamp with time zone NOT NULL,
>   db_time timestamp with time zone NOT NULL DEFAULT now(),
>   "values" text[],
>   CONSTRAINT table_pkey PRIMARY KEY (id, timest)
> )
> [...] 
> SELECT MAX(table.timest)  FROM table, table1  WHERE
> table.id=table1.id and table1.id in (1,2,3) GROUP BY table.id
> 
> [...] it is terribly slow,
> when I use this strange syntax
> 
> SELECT table.timest  FROM table,table1 WHERE
> table.id=table1.id and table1.id in(1,2,3) and table.timest=
> (SELECT max(timest) FROM table WHERE table.id=table1.id)  
> 
> I receive all needed data very fast.
> 
> My questions are
> 1)  why is the first query slow, and what can I do to make it faster
> (some more indexes?)?
> 2)  what kind of danger do I have with the second query (so far I have
> gotten the right data)?
> 
> I have Postgres 8.3 and the table has over a million rows. [...]

Q1) EXPLAIN ANALYZE output would likely give (at least partially) some
clarification about observed performance.
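
For the first query, that would be for instance:

EXPLAIN ANALYZE
SELECT MAX(table.timest) FROM table, table1
WHERE table.id=table1.id AND table1.id IN (1,2,3)
GROUP BY table.id;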

According to the "table" definition, the implicit index created for
"table_pkey" would not be efficiently used in the 1st query because of
the explicit aggregation on column "id" (i.e. a partial key of "table"). A full
scan of "table" is assumed for the join despite the index scan on "table1"
with the "id" index. Each "table1" row is then joined with millions of rows
of "table" before the WHERE clauses are matched (as the latter apply to each
row resulting from the join). Slowness is expected.

3 ideas of improvement (which can be combined) come to my mind at
this point:
ID1- Definition of an explicit index on "table.id": this would slightly
quicken aggregation by "id" and the join on "id";
ID2- Aggregation on "table1.id" instead of "table.id": because of
reference declaration of "table.id" on "table1.id", "table1.id" is
assumed to be a (primary) key of "table1" (and thence bound to an index,
speeding up aggregation);
ID3- Integration of WHERE clause "table1.id IN (1,2,3)" into a subquery
on "table1" (because this filter is independent from "table") and use of
this subquery instead of "table1" call for join. This would reduce the
size of the table to join to "table" and hence reduce the number of
join rows at an earlier stage of query execution.

A possible rewritten query would express as follows:
SELECT MAX(t.timest)
FROM table t
INNER JOIN (SELECT id FROM table1 WHERE id IN (1,2,3)) t1
ON t.id = t1.id
GROUP BY t1.id;
(In this case, indexing on "table1.id" is not necessary although
recommended.)

Inviting you to assess this proposal (or a derivative according to ideas
1, 2, and/or 3).

A 4th idea may consist in directly filtering on "table.id IN (1,2,3)", if
it is certain that these 3 values are present in column "table1.id". Such an
approach strengthens the case for an index declared on "table.id", so that
the lookup quickens.

The query would look like the following:
SELECT MAX(timest)
FROM table
WHERE id IN (1,2,3)
GROUP BY id;

Q2) On the other hand, the 2nd query seems to take advantage of the "table"
index on its primary key (id, timest), as the 2 columns are explicitly
referred to in the WHERE clauses. The sub-select would be a bottleneck; but the
WHERE clause of this sub-select refers to a literal against column "id",
i.e. a hash join is assumed to be efficient and used in this case by the
database engine for a faster evaluation.

Finally, I am not sure what you have in mind by "danger" in this
2nd query. Maybe I missed some elements.

Regards.
--
nha / Lyon / France.

-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql