Folks,

It's a real nightmare when you have to purge 2-gig-plus files, eh? The purge can run for weeks 
and have your users complaining the whole time too. Nag... nag... nag...

Simple solution:

Back up the entire data file to media. Select what you want to KEEP and move those records 
to a temp file. CLEARFILE the original in a BASIC program (or delete the old data file), then 
rename the temp file or copy the records back into the original data file.
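
A rough sketch of that copy-out / clear / copy-back pass in UniVerse BASIC (DATA.FILE, 
TEMP.FILE and the PURGE.DATE field are just placeholder names here, and the "keep" 
criteria will obviously be your own):

    OPEN 'DATA.FILE' TO F.DATA ELSE STOP 'Cannot open DATA.FILE'
    OPEN 'TEMP.FILE' TO F.TEMP ELSE STOP 'Cannot open TEMP.FILE'
    * copy the records you want to KEEP out to the temp file
    EXECUTE 'SELECT DATA.FILE WITH PURGE.DATE GE "01/01/2004"' RTNLIST KEEPERS
    LOOP
       READNEXT ID FROM KEEPERS ELSE EXIT
       READ REC FROM F.DATA, ID THEN WRITE REC TO F.TEMP, ID
    REPEAT
    CLEARFILE F.DATA                     ;* original file is now empty
    * copy the keepers back into the now-empty original
    EXECUTE 'SELECT TEMP.FILE' RTNLIST KEEPERS
    LOOP
       READNEXT ID FROM KEEPERS ELSE EXIT
       READ REC FROM F.TEMP, ID THEN WRITE REC TO F.DATA, ID
    REPEAT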

If you have an exclusive access window, this is a piece of cake; otherwise, you have to 
get more creative and carve out your own customized exclusive access window on the 
fly, using a whole host of mirrors and hoping the phone doesn't ring with your 
fingerprints all over a potential mess --- just in case you get distracted and 
mis-shuffle the deck :p)

Bram
  ----- Original Message ----- 
  From: ashish ratna 
  To: Anthony Youngman 
  Cc: [EMAIL PROTECTED] 
  Sent: Monday, March 15, 2004 7:55 AM
  Subject: RE: Help Needed regarding performance improvement of delete query


  Hi Wol,

  The scenario is this:

  We have a master file with more than 3-4 million records, and a corresponding
  reference file which contains reference data for this master file.

  Now we start our purge program, which selects records from the master file on the basis
  of date. The corresponding data should then be deleted from the other file (the reference file).

  For this requirement we have adopted the following approach: select the records from the
  master file on the basis of date, save the list of these records, then on the basis of
  this list select the records from the reference file.

  The issue is that this list contains more than 0.5 million record IDs, and I want to take
  a few (say 10,000 at a time) from this list for further processing.
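
  Roughly, the kind of loop we have in mind is sketched below (MASTER.FILE, REFERENCE.FILE,
  PURGE.DATE and the cutoff date are only placeholder names, and the date select is shown
  inline instead of via a saved list just to keep the sketch self-contained):

     OPEN 'REFERENCE.FILE' TO F.REF ELSE STOP 'Cannot open REFERENCE.FILE'
     EXECUTE 'SELECT MASTER.FILE WITH PURGE.DATE LT "01/01/2003"' RTNLIST MASTER.LIST
     COUNT = 0
     LOOP
        READNEXT MASTER.ID FROM MASTER.LIST ELSE EXIT
        * ... select and DELETE the reference records for MASTER.ID here ...
        COUNT = COUNT + 1
        IF MOD(COUNT, 10000) = 0 THEN
           * end of a 10,000-record batch: log progress, release locks, pause, etc.
        END
     REPEAT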

  Any pointers for this problem will be very helpful.

  Thanks in advance.

  Ashish.



  -----Original Message-----
  From: Anthony Youngman [mailto:[EMAIL PROTECTED]
  Sent: Monday, March 15, 2004 4:50 PM
  To: ashish ratna
  Subject: RE: Help Needed regarding performance improvement of delete
  query


  Ahhh

  I thought you were selecting records and deleting them. So the first
  thousand would have disappeared, and you would obviously get a different
  thousand next time round because the first lot would have gone :-)
  Sounds like that's not the case.

  In that case, do you have a field that is repeated across many records,
  but where each individual value (or range of values) wouldn't return too
  big a bunch of records? Or do you have a numeric id - you could declare
  an index on that ...

  Let's take that numeric id idea - and then you'll have to build on it
  for yourself. Declare an i-descriptor as @ID[3]. That logically
  partitions your file into a thousand pieces. Declare an index on this.
  The first term in your select will then be "WITH IDESC EQ
  whatever-number-you-want". That'll reduce the load on the database for
  each pass, you'll just need to wrap it in a loop where "whatever-number"
  goes from 0 to 999

  Actually, what I would probably do is declare my i-descriptor as
  INT(@ID/30) and then run this purge daily with "whatever-number" as
  today's day. Obviously, it'll do nothing on the 31sts, and in March
  it'll do two months' work on the 29th and 30th, but you've reduced the
  "hit" on the system considerably.

  Without knowing what you're doing in more detail, it's difficult to give
  you any proper advice, but certainly I'll try and think of any tips that
  you can build upon, like this. But you need to work out what's right for
  you :-)

  NB Any reason for taking this off the mailing list? By all means cc it
  to me, but if you keep it on the list there are other people who may be
  able to help too - I'm very good at overviews, but I fall short on the
  logic - I think there is a way to get the next thousand records, but I
  haven't got a clue what the syntax is ...

  Cheers,
  Wol

  -----Original Message-----
  From: ashish ratna [mailto:[EMAIL PROTECTED] 
  Sent: 15 March 2004 10:33
  To: Anthony Youngman
  Subject: RE: Help Needed regarding performance improvement of delete
  query

  Hi,

  Thanks for the nice suggestions. 
  I have another question: using SAMPLE, once I have processed 1000 records
  (using SAMPLE 1000), how can I select the next 1000 records in the next run
  (i.e. records 1001 to 2000, and so on)?

  I was trying a few combinations but didn't succeed. Can you tell me the
  syntax for that?

  Thanks again.

  Regards,
  Ashish.



  -----Original Message-----
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  Behalf Of Anthony Youngman
  Sent: Monday, March 15, 2004 2:32 PM
  To: U2 Users Discussion List
  Subject: RE: Help Needed regarding performance improvement of delete
  query


  This might help speed things up a bit ...

  Firstly, of course, is your file properly sized?

  Secondly, (and in this case you will need to run the SELECT / DELETE
  sequence several times) try putting a SAMPLE 1000 (or whatever number
  makes sense) at the end of your select.

  Basically, this will mean that the SELECT runs until it finds that
  number of records and then stops. So each sequence won't load the system
  so badly. Creating a huge select list will stress your RAM badly ...
  looping through this sequence won't stress the system so badly, though
  you really do need to use indices to reduce the stress even more ...
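
  As a very rough sketch (file and field names are placeholders, not your real
  ones), the select / delete loop might look like:

     OPEN 'MASTER.FILE' TO F.MASTER ELSE STOP 'Cannot open MASTER.FILE'
     MORE = 1
     LOOP WHILE MORE DO
        EXECUTE 'SELECT MASTER.FILE WITH PURGE.DATE LT "01/01/2003" SAMPLE 1000' RTNLIST BATCH
        MORE = 0
        LOOP
           READNEXT ID FROM BATCH ELSE EXIT
           MORE = 1                      ;* found at least one, so go round again
           DELETE F.MASTER, ID           ;* deleted, so the next SAMPLE finds fresh ones
        REPEAT
     REPEAT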

  Create an index on various fields that you're using as your select
  criteria. If you're selecting old records, then you need to select on
  date, and this really will make life both easy and fast. The more
  closely you can guarantee that a select, acting on a single index, will
  pick up only or mostly records that you are going to delete, the better.
  That will SERIOUSLY reduce the time taken and the performance hit.
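
  For instance (with PURGE.DATE standing in for whatever your date field is
  really called), something along the lines of:

     CREATE.INDEX MASTER.FILE PURGE.DATE
     BUILD.INDEX MASTER.FILE PURGE.DATE

  and from then on a "SELECT MASTER.FILE WITH PURGE.DATE LT ..." should be able
  to resolve most of its work from the index rather than scanning the whole file.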

  Cheers,
  Wol

  -----Original Message-----
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  On Behalf Of ashish ratna
  Sent: 15 March 2004 08:25
  To: [EMAIL PROTECTED]
  Subject: Help Needed regarding performance improvement of delete query

  Hi All,
   
  We are working on purging old data from the database, but we are facing
  performance problems with this.
   
  We are using a select query which is created dynamically on the basis of
  the number of records. We want to know if there is any limit on the size of
  a query in UniVerse.
   
  The UniVerse help PDF mentions that there is no limit on the length of a
  select query, but when we run the program on a file with more than 0.5
  million records it gives this error:
   
  "Pid 14433 received a SIGSEGV for stack growth failure. Possible causes:
  insufficient memory or swap space, or stack size exceeded maxssiz. 

  Memory fault(coredump)"

  If there is no limitation on the size of the query, then please suggest some
  other possible solution which can help us reduce the query time and complete
  the process successfully without this error.

  Thanks in advance.

  Regards,

  Ashish.






--
u2-users mailing list
[EMAIL PROTECTED]
http://www.oliver.com/mailman/listinfo/u2-users
