QT wrote:
> I have a database with thousands of records. Every minute I am
> selecting 50 rows from that database and sending some data to another
> web URL according to the selected 50 rows.
>
> Whenever my script finishes the job in one minute, I have no problem,
> because after each transaction I am updating one of the rows' fields
> to "used".

If you actually have MySQL transactions (i.e. a transactional table type
such as InnoDB), you should be able to wrap the entire process in a
single transaction, and it should work -- assuming your MySQL setup
passes the whole ACID test...  You'll have to read the MySQL manual and
understand ACID, which I only grok while I'm reading about it, or I'd
just tell you what you need to know.  Sorry.
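For what it's worth, the old mysql_* functions have no dedicated
transaction call; you just issue the SQL statements yourself.  Here's a
rough sketch of the "one big transaction" idea -- assuming a
transactional table type (InnoDB) and an open $connection, and not
something I've tested against your schema:

mysql_query("BEGIN", $connection);   // or "START TRANSACTION"
//... SELECT your 50 rows, send them to the other URL,
//... and UPDATE them to 'used', all inside the transaction ...
mysql_query("COMMIT", $connection);  // make it all permanent at once
//If anything went wrong along the way, undo it all instead:
//  mysql_query("ROLLBACK", $connection);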

> BUT! If my script cannot finish sending data to the other URL, due to
> a slow connection or some such problem, then on the next minute the
> same script runs automatically and selects another 50 rows which are
> not marked as "used".
>
> In this case, sometimes these two running scripts select the same
> rows, because the old one hasn't yet marked its selected rows as
> "used".
>
> Is there any way to put a flag or something like that when selecting
> rows with PHP and MySQL?

Ah.

Given the problem you are facing -- a slow connection -- you probably
don't want to tie up the whole table with a transaction in the first
place.  Whether you get a table-level lock depends on your table type:
MyISAM only does table-level locks, while InnoDB does row-level locking.
You'd have to read the MySQL docs carefully, and check which table type
and version you have, to see if you get row-level locking.
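If your MySQL *does* have a row-level-locking table type (InnoDB),
something like SELECT ... FOR UPDATE inside a transaction should lock
only the 50 matched rows instead of the whole table.  A sketch, assuming
InnoDB -- on MyISAM you'd still get a table lock:

mysql_query("BEGIN", $connection);
$query = "select table_id from your_table where not status = 'used' limit 50 for update";
$selected = @mysql_query($query, $connection) or
  trigger_error(@mysql_error($connection) . " $query", E_USER_ERROR);
//... UPDATE those rows to 'used' here ...
mysql_query("COMMIT", $connection);  // commits and releases the row locks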

Without row-level locking, your other scripts will wait for the slow
connection to release the whole table from its locked state.

That would probably be "bad" for most applications.

Here is what I would try if I were you:

//Possibly START a MySQL transaction HERE:
$query = "select table_id from your_table where not status = 'used' limit 50";
$selected = @mysql_query($query, $connection) or
  trigger_error(@mysql_error($connection) . " $query", E_USER_ERROR);
//Collect the batch of ids into a PHP array:
$selected_ids = array();
while (list($id) = @mysql_fetch_row($selected)){
  $selected_ids[] = $id;
}
//Guard against an empty batch -- "in ()" would be a MySQL syntax error:
if (!count($selected_ids)){
  exit; //nothing left to send this minute
}
$selected_ids_sql = implode(', ', $selected_ids);
$query = "update your_table set status = 'used' where table_id in ($selected_ids_sql)";
$used = @mysql_query($query, $connection) or
  trigger_error(@mysql_error($connection) . " $query", E_USER_ERROR);
//END the transaction started above here.

//*NOW* go ahead and get the stuff you want from your selected records
//and send them off to that slow connection:
$query = "select fields, you, really, want from your_table where table_id in ($selected_ids_sql)";
$data = @mysql_query($query, $connection) or
  trigger_error(@mysql_error($connection) . " $query", E_USER_ERROR);
while (list($fields, $you, $really, $want) = @mysql_fetch_row($data)){
  //Send this row to the slow connection
}

By dealing with *ALL* the rows at once when you first SELECT them, and
then doing an UPDATE on *ALL* the rows at once right away, you can keep
the other scripts from getting confused about which records are 'used'.

If you wrap just that part of the code in a MySQL transaction, you make it
virtually 100% "safe" [*] for the other scripts.
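Concretely, the two transaction comments in the code above could become
something like this (again assuming a transactional table type like
InnoDB -- on MyISAM the statements are accepted but don't actually do
anything):

mysql_query("BEGIN", $connection);   // replaces the //START comment
//... the SELECT of table_id and the UPDATE to 'used' go here ...
mysql_query("COMMIT", $connection);  // replaces the //END comment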

Most importantly, by pulling the SELECT and UPDATE out, you can handle
the 'used' attribute in a transaction *quickly*, without bogging down
your application waiting for that slow connection, or even just waiting
to get all the data from MySQL to PHP.  For speed, and for tracking
which records were 'sent', you really only care about the ID field
(assuming you have one) and the 'used' status.

This all assumes that you don't care if the slow connection gets its
records *AFTER* some other fast connection gets the next 50 records...
If that's not what your application needs, DON'T DO THIS. :-)

[*] NOTE:
The transaction itself is 100% safe, but if your slow connection dies,
you've got a bunch of records marked 'used' that were *NOT* really used,
because the script died after you marked them.

Hopefully, that's a fairly rare occurrence, right?

Since it is rare, you can write your code to DETECT, in that last loop,
that the slow connection died and a record you tried to send didn't
really get sent; then you can UPDATE that single record's 'used' setting
back to 'unused' (or whatever you call it).
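Something along these lines would do it -- with a hypothetical
send_row() standing in for whatever actually pushes a row across the
slow connection, and assuming you also pull table_id in that last
SELECT so you know which record to reset:

while (list($id, $fields, $you, $really, $want) = @mysql_fetch_row($data)){
  if (!send_row($fields, $you, $really, $want)){  //hypothetical sender
    //The send failed -- put just this one record back in the pool:
    $query = "update your_table set status = 'unused' where table_id = $id";
    @mysql_query($query, $connection) or
      trigger_error(@mysql_error($connection) . " $query", E_USER_ERROR);
  }
}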

So the basic algorithm is:

BEGIN TRANSACTION
  SELECT 50 ids
  UPDATE those 50 ids as 'used' in ONE query
END TRANSACTION
[the above should happen about as fast as PHP/MySQL can go]
SELECT actual data for selected 50 ids
send data to client
if a record fails to send, re-set status to not be 'used' so it can get
sent in a later batch.

The down-side is that faster connections will get records "out of
sequence" while the slower connection is still chugging away trying to
process the records that are "earlier" in your sequence.  In most
applications of this nature, that's okay.

If that's not okay, this is entirely the wrong algorithm to use, and you
need to wrap the whole thing in a TRANSACTION and accept the fact that
your slow connection is going to lock up your entire database until it
finishes all 50 records...  At which point you can try some combination
of things like:
  Send fewer than 50 records, at least to connections that have proven
to be "slow" in the past.
  Buy more hardware or acceleration software to make up for the slowness
of some connections by making all the rest go faster, and hope it
averages out.
  Build in a system that detects slow connections and rolls back your
transactions when something is too slow, and hope the next time it's
faster.

What you do among all these choices depends more on your application's
needs than on what's "fastest" or "best" really.

-- 
Like Music?
http://l-i-e.com/artists.htm
