date:20000831

Re: [SQL] Outer join statement ?

2000-08-31 Thread Alexandru COSTIN


Hello,
There are 2 ways of doing outer joins, both are relatively expensive
from the CPU time point of view
1. Create a function that returns the ojined value, or null if the value
does not exists in the secod table
2. Use a union beetwen a left join and a straight select with a where
condition. Next there's an example inspired from the Adison Westley book (I
suggest you grab a copy from the postgresql site. (/docs)

select customer.name,salesorder.order_id from customer join salesorder
on customer.cust_id=salesorder.fk_cust_id union all select
customer.name,null::int4 from customer where customer.cust_id not in (select
fk_cust_id from salesorder);

Probably an improvement here is to use exists on the second select, but
I didn't have the time to document.
Anyway, this will not work embedded in a view ("union all" is not
supported in views).

Alexander

"Nasdaq" <[EMAIL PROTECTED]> wrote in message
[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> The statement "Outer Join" is not available in Postgres ?
> Is there an extension provided by a third company that implement this
> functionnality ?
>
> If not do you know when this functionnality will be available in
> Postgres ?
>
> Thank
>
> Fred
>
> --
> TechBourse : le premier site en FRANCAIS dédié le NASDAQ
> http://www.techbourse.com

[SQL] optimal performance for inserts

2000-08-31 Thread Rini Dutta


Thanks for your suggestions, though I've already
considered most of them. (I have a detailed reply
below, interleaved with your mail). 

I am considering an option but would need help from
somebody who knows how the backend works to be able to
figure out if any of the following options would help.
Consider the scenario of a database with say 3 tables,
and atleast  3 concurrent writers to all the tables
inserting different records. Which of the three
options would be expected to perform better ? (I am
using JDBC, I dont know if that is relevant)

1. Having a different Connection per writer
2. Having a different Connection per table
3. Having a single Connection which performs the 3
transactions sequentially.

I was trying out some tests to decide between option 1
& option 2 , but did not get any conclusive results.

Would be helpful to get some suggestions on the same.

Thanks,
Rini

--- Mitch Vincent <[EMAIL PROTECTED]> wrote:
> Removing indexes will speed up the INSERT portion
> but slow down the SELECT
> portion.
I cannot remove indexes since there may be other
queries to these tables at the same time when I am
doing the inserts.

> Just an FYI, you can INSERT into table (select
> whatever from another
> table) -- you could probably do what you need in a
> single query (but would
> also probably still have the speed problem).
I have not spent time on it but I could not figure out
how to have an insert statement such that one of the
attributes (only) is a result of a select from another
table. I would be interested in knowing if there is a
way to do that.

> Have you EXPLAINed the SELECT query to see if index
> scans are being used
> where possible?
Yes, the index scans are being used

> -Mitch
> 
> - Original Message -
> From: "Rini Dutta" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Friday, August 25, 2000 12:20 PM
> Subject: [SQL] queries and inserts
> 
> 
> > Hi,
> >
> > I am interested in how to speed up storage. About
> 1000
> > or more inserts may need to be performed at a time
> ,
> > and before each insert I need to look up its key
> from
> > the reference table. So each insert is actually a
> > query followed by an insert.
> >
> > The tables concerned are :
> > CREATE TABLE referencetable(idx serial, rcol1 int4
> NOT
> > NULL, rcol2 int4 NOT  NULL, rcol3 varchar(20) NOT
> > NULL, rcol4 varchar(20), PRIMARY KEY(idx) ...
> > CREATE INDEX index_referencetable on
> > referencetable(rcol1, rcol2, rcol3, rcol4);
> >
> > CREATE TABLE datatable ( ref_idx int4,
> > start_date_offset int4 NOT NULL, stop_date_offset
> int4
> > NOT NULL, dcol4 float NOT NULL, dcol5  float NOT
> NULL,
> > PRIMARY KEY(ref_idx, start_date_offset),
> CONSTRAINT c1
> > FOREIGN KEY(ref_idx) REFERENCES
> referencetable(idx) );
> >
> > I need to do the following sequence n number of
> times
> > -
> > 1. select idx (as key) from referencetable where
> > col1=c1 and col2=c2 and col3=c3 and col4=c4;
> (Would an
> > initial 'select into temptable'  help here since
> for a
> > large number of these queries 'c1' and 'c2'
> > comnbinations would remain constant ?)
> > 2. insert into datatable values(key, );
> >
> > I am using JDBC interface of postgresql-7.0.2 on
> > Linux. 'referencetable' has about 1000 records, it
> can
> > keep growing. 'datatable' has about 3 million
> records,
> > it would grow at a very fast rate. Storing 2000
> > records takes around 75 seconds after I vacuum
> > analyze. (before that it took around 40 seconds -
> ???)
> > . I am performing all the inserts ( including the
> > lookup) as one transaction.
> >
> > Thanks,
> > Rini
> >
> >
> > __
> > Do You Yahoo!?
> > Yahoo! Mail - Free email you can access from
> anywhere!
> > http://mail.yahoo.com/
> >
> 


__
Do You Yahoo!?
Yahoo! Mail - Free email you can access from anywhere!
http://mail.yahoo.com/

Re: [SQL] RE: Create table in functions

2000-08-31 Thread Jan Wieck

Andreas Tille wrote:
> On Thu, 24 Aug 2000, Hiroshi Inoue wrote:
>
> > Hmm,Andreas's original function seems to contain other statements.
> > If the function contains DML statements for the table Temp_Num_Table,
> > it wouldn't work properly. i.e 1st call would work but 2nd call woudn't.
> That's the problem.  I'm in the process of porting a set of about
> 30 Stroed Procedures from MS-SQL server to PostgreSQL and have
> just managed only 3 :-( because of some "This is not possible with
> PostgreSQL".

Could you try out the current CVS sources?

I've  added  an  EXECUTE  keyword  to PL/pgSQL earlier - er -
yesterday.  It should give you the power to at least port the
proc below to PostgreSQL.

> To make clear what I want to do, I just copy the original stored
> procedure, which might be not so clever so that it would be possible
> to go without the table creation:

Looks  like  the  only purpose of the (temp) table is to hold
exactly one single row of one attribute - the result  of  the
count(*).   From  the  code I guess that MS-SQL's only way to
build up a dynamic query is to put it  as  a  string  into  a
variable and call Exec(@var).  Maybe it's impossible to get a
return value out of that  at  the  same  time,  so  the  only
solution  is  to  throw  it  into a temp table and pick it up
again.

But anyway, the dynamic part of  building  the  qualification
(WHERE  clause)  was  the initial problem. And that should be
solved in CURRENT PostgreSQL sources (and  get  shipped  with
7.1).

Read on below...

>
>
> CREATE Procedure TilleA.pHelpCountNames
> ( @Arbeitsgruppeshort varchar(255) ,
>   @Condition  varchar(255)
> )
> /* Count names in table Mitarbeiter which belong to the group
>Arbeitsgruppeshort and match the condition @Condition
> */
> As
>   Declare @Query varchar(1024)
>   Declare @num int
>   Select  @num = 0
>
>   if @ArbeitsgruppeShort is NULL or @ArbeitsGruppeShort = '' begin
> Select @ArbeitsgruppeShort = ''
>   end else begin
> Select @ArbeitsgruppeShort = ' AND a.ArbeitsgruppeShort = ' +  + 
>@ArbeitsgruppeShort + 
>   end
>
>   Create Table #NumTable
>   (
>Num integer
>   )
>
>   Select @Query =
>  'Insert Into #NumTable (Num) ' +
>  'SELECT Count (*) ' +
>  'FROM Mitarbeiter m ' +
>  'INNER JOIN tm2nMitarbeiterArbeitsgruppe t ON m.IdMitarbeiter   = 
>t.IdMitarbeiter ' +
>  'INNER JOIN Arbeitsgruppe aON t.IdArbeitsgruppe = 
>a.IdArbeitsgruppe ' +
>  'WHERE ' + @Condition + @ArbeitsgruppeShort
>   Exec (@Query)
>
>   Select @num=(Select Max(Num) From #NumTable)
>
>   Drop Table #NumTable
>
>   return @num
>
>
> May be there are some clever ways to avoid the temporary table.
> I really wonder if my solution is in fact very clever because I'm unsure
> if it is thread-safe.
>
> Any hints?

Let me give it a try:

CREATE FUNCTION pHelpCountNames (text, text) RETURNS integer AS
  '
DECLARE
  ag_short  ALIAS FOR $1;
  condition ALIAS FOR $2;
  query text;
  rec   record;
  resultinteger;
BEGIN
  query := ''SELECT count(*) AS n FROM Mitarbeiter m,
tm2nMistarbeiterArbeitsgruppe t,
Arbeitsgruppe a
 WHERE m.IdMitarbeiter = t.IdMitarbeiter
   AND t.IdArbeitsgruppe = a.IdArbeitsgruppe'';

  IF condition !=  THEN
query := query || '' AND '' || condition;
  END IF;

  IF ag_short !=  THEN
query := query || '' AND a.ArbeitsgruppeShort = '' ||
 ag_short || ;
  END IF;

  FOR rec IN EXECUTE query LOOP
result = rec.n;
  END LOOP;

  RETURN result;
END;
  ' LANGUAGE 'plpgsql';

Just written down. Too lazy to test it right now.

BTW:   It   is  not  strictly  neccessary  to  build  up  the
querystring  for  EXECUTE  in  a  variable.  Any   expression
resulting  in a string (a function call or a concatenation of
strings, variables and functions) is allowed.

EXECUTE outside of a FOR statement simply executes the  query
and  that  should  not  be  a  SELECT.  This can be a utility
statement as well, so feel free to  create  and  drop  "temp"
tables as needed. Used as above (inside the FOR) it must be a
SELECT statement. I haven't checked if  rows  found,  because
that  count(*)  query is guaranteed to return either one row,
or bomb out with a transaction abort. Usually someone  should
check.

Enjoy.

Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being ri

Re: [SQL] Outer join statement ?

[SQL] optimal performance for inserts

Re: [SQL] RE: Create table in functions

3 matches

Site Navigation

Mail list logo

Footer information