Hi Wol,
The scenario is that-
We have a master file having more than 3-4 million records and have corresponding
reference file which contains reference data for this master file.
Now we start our purge program which selects records from master file on the basis of
date. Corresponding data should be deleted from the other file (reference file).
For this requirement we have adopted the approach that- select the record from master
file on the basis of date. Save the list of these records, then on the basis of this
list select the records from reference file.
Issue is that this list contains more than 0.5 million and I want to take few (say
10,000 at a time) record ids from this list for further processing.
Any pointers for this problem will be very helpful.
Thanks in advance.
Ashish.
-Original Message-
From: Anthony Youngman [mailto:[EMAIL PROTECTED]
Sent: Monday, March 15, 2004 4:50 PM
To: ashish ratna
Subject: RE: Help Needed regarding performance improvement of delete
query
Ahhh
I thought you were selecting records and deleting them. So the first
thousand would have disappeared, and you would obviously get a different
thousand next time round because the first lot would have gone :-)
Sounds like that's not the case.
In that case, do you have a field that is repeated across many records,
but where each individual value (or range of values) wouldn't return too
big a bunch of records? Or do you have a numeric id - you could declare
an index on that ...
Let's take that numeric id idea - and then you'll have to build on it
for yourself. Declare an i-descriptor as @ID[3]. That logically
partitions your file into a thousand pieces. Declare an index on this.
The first term in your select will then be WITH IDESC EQ
whatever-number-you-want. That'll reduce the load on the database for
each pass, you'll just need to wrap it in a loop where whatever-number
goes from 0 to 999
Actually, what I would probably do is declare my i-descriptor as
INT(@ID/30) and then run this purge daily with whatever-number as
today's day. Obviously, it'll do nothing on the 31sts, and in March
it'll do two month's work on the 29th and 30th, but you've reduced the
hit on the system considerably.
Without knowing what you're doing in more detail, it's difficult to give
you any proper advice, but certainly I'll try and think of any tips that
you can build upon, like this. But you need to work out what's right for
you :-)
NB Any reason for taking this off the mailing list? By all means cc it
to me, but if you keep it on the list there are other people who may be
able to help too - I'm very good at overviews, but I fall short on the
logic - I think there is a way to get the next thousand records, but I
haven't got a clue what the syntax is ...
Cheers,
Wol
-Original Message-
From: ashish ratna [mailto:[EMAIL PROTECTED]
Sent: 15 March 2004 10:33
To: Anthony Youngman
Subject: RE: Help Needed regarding performance improvement of delete
query
Hi,
Thanks for the nice suggestions.
I have another question that, using SAMPLE once I process 1000 records
(using SAMPLE 1000), how can I select next 1000 records in next run
(i.e. 1001 to 2000 records and so on)?
I was trying few combinations but didn't succeeded. Can you tell me the
syntax for that?
Thanks again.
Regards,
Ashish.
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Behalf Of Anthony Youngman
Sent: Monday, March 15, 2004 2:32 PM
To: U2 Users Discussion List
Subject: RE: Help Needed regarding performance improvement of delete
query
This might help speed things up a bit ...
Firstly, of course, is your file properly sized?
Secondly, (and in this case you will need to run the SELECT / DELETE
sequence several times) try putting a SAMPLE 1000 (or whatever number
makes sense) at the end of your select.
Basically, this will mean that the SELECT runs until it finds that
number of records and then stops. So each sequence won't load the system
so badly. Creating a huge select list will stress your ram badly ...
looping through this sequence won't stress the system so badly, though
you really do need to use indices to reduce the stress even more ...
Create an index on various fields that you're using as your select
criteria. If you're selecting old records, then you need to select on
date, and this really will make life both easy and fast. The more
closely you can guarantee that a select, acting on a single index, will
pick up only or mostly records that you are going to delete, the better.
That will SERIOUSLY reduce the time taken and the performance hit.
Cheers,
Wol
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of ashish ratna
Sent: 15 March 2004 08:25
To: [EMAIL PROTECTED]
Subject: Help Needed regarding performance improvement of delete query
Hi All,
We are working for purging of old data from the database. But we are
facing performance problems in this.
We are using select