Karen - Yes I did do the multi-column index (on both tables), dropped the index on the column being updated and changed the order of the tables in the UPDATE command.

But, this is embarrassing. The reason execution time was about 1 second is 'cuz I didn't have ERR MESSAGES on. My fat fingers had mistyped a column name and of all times to forgo R:Style.... "Aw heck, just a couple of column names, what can possibly go wrong?" The UPDATE command never executed.

So I got the column name fixed and was running some tests on three different arrangements of the UPDATE command. But here is a strange: Dennis's format runs in about 2 minutes with MANOPT ON and 5 seconds with MANOPT OFF. The disconcerting part is with MANOPT ON, 2674 rows get updated; with MANOPT OFF, 2711 rows get updated.

I can understand the execution times being different, but the number of rows updated??
I'll do some more testing tomorrow.
Doug

On 2/3/2015 5:41 PM, Karen Tellef wrote:
What else did you implement, Doug?  Did you do the multi-column index?

Karen



-----Original Message-----
From: Doug Hamilton <[email protected]>
To: RBASE-L Mailing List <[email protected]>
Sent: Tue, Feb 3, 2015 5:12 pm
Subject: [RBASE-L] - RE: WHERE Clause, indices and parens questions

Thanks Dennis!  That's exactly what I was looking for.
The column I'm updating _is_ indexed and I can understand why it would take long to update. I'll double check to see if the index is really needed. However, I did go ahead and implement your other suggestions and tested it on my system which is a recent copy of the actual db. Holy carp! Processing time went from about 30 seconds down to about 1 second!

I have one question on your point #2: I assume TxnHist should be listed first in the FROM clause because it is the table being updated and not because it is the larger (more rows) of the two tables.

Thanks again and best wishes for continued recovery from your auto accident.
Doug

On 2/3/2015 9:48 AM, Dennis McGrath wrote:
Doug,

First of all, is the column you are updating indexed?  That would slow updating 
it tremendously on a table this long.

If that is not the case I would do this:
1. Make a multi column index on your temp table for the 3 columns in the order 
that is used in your joining where clause.
2. Make the temp table the second table, not the first.
3. Set manopt on to make sure R:BASE follows your optimization.
4. Use this syntax (no parenthesis around the where clause):

UPDATE TxnHist +
     SET ChryInvNbr = INV.ChryInvNbr +
     FROM TxnHist TXN, ChryInvDtlTmp INV  +
   WHERE  +
     TXN.VPlNmbr = INV.VPlNmbr  AND +
     TXN.CusPnbr = INV.CusPnbr AND +
     TXN.TxDate = INV.InvoiceDate


This will avoid trying to use any of the single indexes in TxnHist, and use a 
very efficient multi-column index to get the update value from the temp table.

Further optimization can be done by changing the where clause (and temp index) 
clause so the most unique column is first.
I suspect InvoiceDate would be the most unique, but only you can answer that 
question.



Dennis McGrath


-----Original Message-----
From:[email protected]  [mailto:[email protected]] On Behalf Of Doug
Hamilton
Sent: Monday, February 02, 2015 8:15 PM
To: RBASE-L Mailing List
Subject: [RBASE-L] - WHERE Clause, indices and parens questions

Hi All -
I'm updating a column in a 1.2 megarow table (TxnHist) with values from
a column in a temp table (ChryPaid) that may have a few thousand rows,
based on three columns matching between the two tables.

I remember previous posts about use of parenthesis to force (or ignore?)
use of indexed columns and the best order in which to list the columns.
Unfortunately I can't find that post 'cause I'm asking those questions
again:
How should I use parenthesis to optimize the WHERE clause?
    Does the order of the columns matter? i.e. most unique listed first
or last?
    Are these even a concern if MANOPT is OFF?

The reason I'm asking is that it took about an hour run the following
command:

UPDATE TxnHist +
    SET ChryInvNbr = INV.ChryInvNbr +
    FROM ChryInvDtlTmp INV, TxnHist TXN  +
    WHERE  +
    (INV.VPlNmbr = TXN.VPlNmbr AND +
    INV.CusPnbr = TXN.CusPnbr AND +
    INV.InvoiceDate = TXN.TxDate)

Column CusPnbr in table TxnHist is an FK, the other five columns in the
WHERE clause are single-column indices.
The most unique column in TxnHist is VPlNmbr.

Settings in the RBASE.DAT file are:
SET STATICDB ON
SET FASTLOCK ON
SET QUALCOLS 2

MANOPT is not set in RBASE.DAT or in code, I believe, so it is OFF by
default.

Per Taskmanager, Network utilization is 17% on a 100 mbps connection
(high?).  CPU usage is about 10% to 15%.
This is on an XP machine with 2 Gb RAM.
R:Base is version 9.5.5.20116.

Thanks much for your help and my apologies for the repeat questions.
Doug

---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com




------------------------------------------------------------------------
<http://www.avast.com/>   
This email has been checked for viruses by Avast antivirus software.
www.avast.com <http://www.avast.com/>





---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com

Reply via email to