Patrick Hatcher wrote:

Scenario: Each day I update a column in a table with an internal percentile value. To arrive at this value, I'll get a count of records with sales > 0 and then divide this count by the total number of tile groups I want. So for example: Total records w/sales > 0 = 730,000 tile# = 100 total percentile groups (730,000 / 100):7300

Now that I have the total number of groups I need, I cycle through my
recordset, grab the key field and the current percentile number and stuff
the values into a temp table. I mod the current row counter against the
total percentile group number.  If it is 0 then I add 1 to the  current
percentile number .  After inserting records into the temp file I then
update the main table.

Using the example above, the first 7300 records would get a  percentile
number of 1, the next 7300  records would get  a percentile number of 2,
then 3,4,5,etc.

Unfortunately, I am going record by record in a loop and the process takes
upwards of 20mins.  Is there a faster way to do this?  I thought about
using limit and offset, but I'm not sure how I would accomplish it.

Idea #1:
Well, if you ordered by something unique, you could use ORDER BY, LIMIT and OFFSET


UPDATE my_table SET my_group=1 WHERE val > 0 ORDER BY val, my_id_column OFFSET 0 LIMIT 7300;
UPDATE my_table SET my_group=2 WHERE val > 0 ORDER BY val, my_id_column OFFSET 7300 LIMIT 7300;
etc.


You'll need to make sure the order is guaranteed for all rows when "val" is the same though.

Now, the problem with this is that by the time you get to group 99, you've had to scan over group 1 98 times - not good. If you knew how many rows there were for any given "val" then you could restrict it much more though.

Idea #2:
Turn your function around. Declare a cursor on the sorted SELECT of target-rows. SKIP 7300 rows and read the primary-key. Use that to issue an update of all rows between two values. That's only 100 updates issued rather than one for each target row.


Idea #3:
Someone might well be able to come up with a clever idea involving a join against a set-returning function, but I'm not sure about guaranteeing the order of the join vs the returned set (and it's getting late here). Any ideas people?


Maybe one of those is some use
--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
     subscribe-nomail command to [EMAIL PROTECTED] so that your
     message can get through to the mailing list cleanly

Reply via email to