I frequently use DISTINCT in subselects to speed up processing

SELECT ... WHERE colname IN ( SELECT DISTINCT colname FROM .....)

Try your queries with and without the DISTINCT.


Updates benefit even more, often reducing the time drasticly.
UPDATE ... WHERE colname IN ( SELECT DISTINCT colname FROM .....)

Dennis McGrath


--- Jim Limburg <[EMAIL PROTECTED]> wrote:
> Thank You Mr R:>
> 
> This is a real good lesson. I see how this would be more effective.
> Now if I can get some of our db's more normalized and
> setup to utiilize this it will be a great benefit.
> 
> Again. Thank You
> Jim Limburg
> 
> A. Razzak Memon wrote:
> 
> > At 01:17 PM 3/3/2004 -0500, Jim Limburg wrote:
> >
> >> I have a question about this that you wrote:
> >>
> >> 08. Use correlated sub-selects or multi-table selects instead
> >> --  of sub-selects
> >>
> >> A multi-table select uses indexes to join the tables. A correlated
> >> sub-select uses indexes to restrict the list of values to be
> >> compared within the sub-select. A plain sub-select will not use
> >> indexes. Each row of data from the main query must be compared to
> >> every row selected by the sub-select.
> >>
> >> My question is basically along the lines of not understanding the
> >> above comment. Please remember those of us more SQL challenged
> >> individuals. When you say plain sub-select, do you mean something
> >> like this SELECT * FROM ITEMS WHERE custnum IN (SELECT custnum
> from
> >> SPECIALORDERS WHERE custnum contains '1234').
> >> Now, does a corrlated sub-select use the T1, T2 type interface?
> >>
> >> Just wanting to learn.
> >
> >
> >
> > Jim,
> >
> > To illustrate, compare the results obtained by using the different
> > techniques on two tables, each with 1,000 unique rows (each row in
> > Table1 has only one match in Table2).
> >
> > (1) A multi-table select, shown below, took 00:05.
> >
> >     SELECT collist FROM table1,table2 +
> >     WHERE Table1.LinkCol = Table2.LinkCol
> >
> > (2) A correlated sub-select, which looked at every row in Table1
> >     but needed to look at only one row in Table2 for each row in
> >     Table1, takes less than 00:10. The example is shown below:
> >
> >     SELECT collist FROM Table1 WHERE LinkCol IN +
> >     (SELECT LinkCol FROM Table2 WHERE
> Table2.LinkCol=Table1.LinkCol)
> >
> > (3) A plain sub-select, which had to look for a match in every row
> >     in Table2 for each of the 1,000 rows of Table1, will take more
> >     than 00:10. (three times, maybe..) to complete. This example,
> >     by far the slowest, is shown below:
> >
> >     SELECT collist FROM Table1 WHERE LinkCol IN +
> >     (SELECT LinkCol FROM Table2)
> >
> > SELECT is a command in which your data really affects the
> performance.
> > The results you see and the method you find best will depend on the
> > number of rows and distribution of data in the tables you are
> combining.
> >
> > You should try those techniques in your real time database and see
> what
> > you find.
> >
> > Hope that helps!
> >
> > Very Best R:egards,
> >
> > Razzak.
> >
> >
> > A. Razzak Memon wrote:
> >
> >>> At 03:48 PM 3/2/2004 -0800, Victor Timmons wrote:
> >>>
> >>>> I would like some tips on how to speedup rbase when it
> >>>> running on a network.
> >>>
> >>>
> >>> Victor,
> >>>
> >>> Here are a few more tips in addition to the suggestions
> >>> posted earlier by fellow R:BASE users/developers:
> >>>
> >>> -- 
> >>> 01. Use the most current version of R:BASE
> >>> --  (I know you are using the latest & greatest version!)
> >>>
> >>> R:BASE Technologies is continually making speed improvements,
> >>> and each new release is faster than the previous one. For
> >>> example, building indexes in R:BASE 6.5/7.x is significantly
> >>> faster than in previous versions.
> >>>
> >>> You can even PACK the INDEXes faster in 6.5++, even in a
> >>> Multi-User environment.
> >>>
> >>>         PACK INDEX
> >>>         or
> >>>         PACK INDEX FOR TableName
> >>>
> >>> -- 
> >>> 02. Think globally instead of row by row
> >>> -- 
> >>> This is one of the best ways to achieve some significant speed
> >>> improvements. Instead of doing row-by-row processing using a
> >>> DECLARE CURSOR, try using a single UPDATE or INSERT command,
> >>> which will take advantage of the SQL select capabilities of
> >>> R:BASE.
> >>>
> >>> For example, suppose you want to create a table to hold
> >>> information about how many of each product a customer has
> >>> purchased. The most logical approach is to set up a cursor
> >>> to go through the detail table and count the rows for each
> >>> customer/product combination and then insert that data into
> >>> the new table. On a table with 3500 rows, processing all
> >>> the rows may take few minutes. By replacing the DECLARE
> >>> CURSOR loop with a single insert command, the time required
> >>> to process all the rows may take 75% less.
> >>>
> >>> Compare the two pieces of code:
> >>>
> >>> DECLARE c1 CURSOR FOR SELECT CustID,PartNo FROM CPProdDet +
> >>> GROUP BY CustID,PartNo
> >>>  OPEN c1
> >>>    FETCH c1 INTO +
> >>>    vCustID INDIC ivCustID, +
> >>>    vPartNo INDIC ivPartNo
> >>>      WHILE SQLCODE < > 100 THEN
> >>>        SELECT COUNT(*) INTO vCount FROM CPProdDet WHERE +
> >>>         CustID = .vCustID AND PartNo =.vPartNo
> >>>        INSERT INTO CustProd (CustID,PartNo,Copies) VALUES +
> >>>        (.vCustID,.vPartNo,.vCount)
> >>>        FETCH c1 INTO +
> >>>        vCustID INDIC ivCustID, +
> >>>        vPartNo INDIC ivPartNo
> >>>      ENDWHILE
> >>>      DROP CURSOR c1
> >>>
> >>>      vs.
> >>>
> >>>      INSERT INTO CustProd (CustID,PartNo,Copies) +
> >>>      SELECT CustID,PartNo,Count(*) +
> >>>      FROM CPProdDet GROUP BY CustID,PartNo
> >>>
> >>> -- 
> >>> 03. Be Creative
> >>> -- 
> >>> R:BASE almost always offers two or more ways of doing something.
> >>> Look for an alternative method for accomplishing a task - it
> >>> might just be considerably faster.
> >>>
> >>> -- 
> >>> 04. Change the environment
> >>> -- 
> >>> The R:BASE environment settings that have the most effect on
> >>> performance are SET MESSAGES.
> >>>
> >>> SET MESSAGES OFF eliminates the time required for screen display
> >>> of messages. This is particularly noticeable if you are doing
> >>> repetitive UPDATE or INSERT commands.
> >>>
> >>> -- 
> >>> 05. Reduce Number of Expressions in Reports
> >>> -- 
> >>> Because all report expressions are processed for each row of
> >>> data read from the reports driving table or view, reducing the
> >>> number of expressions in a report increases the printing speed
> >>> of a report.
> >>>
> >>> -- 
> >>> 06. Abbreviate commands to four characters
> >>> -- 
> >>> R:BASE parses command lines into 4-byte tokens - the fewer
> >>> tokens to read, the faster the command executes.
> >>>
> >>> -- 
> >>> 07. Combine commands whenever possible
> >>> -- 
> >>> Many R:BASE commands allow access to more than one column or
> >>> variable and don't require using separate commands for each.
> >>> This reduces the number of times R:BASE must parse and read a
> >>> command. The fewer commands to read, the faster R:BASE executes.
> >>>
> >>> -- 
> >>> 08. Use correlated sub-selects or multi-table selects instead
> >>> --  of sub-selects
> >>>
> >>> A multi-table select uses indexes to join the tables. A
> correlated
> >>> sub-select uses indexes to restrict the list of values to be
> >>> compared within the sub-select. A plain sub-select will not use
> >>> indexes. Each row of data from the main query must be compared to
> 
=== message truncated ===

Reply via email to