> This actually is a slowdown. You should store each tweet as a row in
> the table, and not 10 tweets as one serialized row. Not only can you
> not search well, but updates and retrieval will be hindered.

Yes, that is definitely true. However, in certain circumstances (such as
data that you will never need to search through) it can be very
useful for saving space and rows.
We store most of the extra data (timezone, etc.) in arrays because
I do not ever envision looking up timezones. The only problem with
arrays is that sorting them is not always easy. You can use sort(),
asort(), or similar functions (sort() reindexes the array, while asort()
keeps the keys), but they can be difficult to master at first.
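
To make the array point concrete, here is a rough sketch. The array
contents and variable names are made up for illustration and are not
our actual schema; it only shows packing never-searched data into one
string and the difference between sort() and asort().

```php
<?php
// Hypothetical extra data that is stored but never searched on.
$extra = ['timezone' => 'Pacific Time (US & Canada)', 'lang' => 'en', 'utc_offset' => -28800];

// Pack the array into one string for a single column, then unpack it.
$packed   = serialize($extra);
$restored = unserialize($packed);

// Hypothetical follower counts keyed by screen name.
$followers = ['carol' => 310, 'alice' => 1200, 'bob' => 45];

// sort() orders by value, but throws the keys away and reindexes from 0.
$byValue = $followers;
sort($byValue);

// asort() orders by value and keeps each key attached to its value.
$byValueKeepKeys = $followers;
asort($byValueKeepKeys);

// arsort() is the same as asort(), but in descending order.
$descending = $followers;
arsort($descending);

print_r($byValue);
print_r($byValueKeepKeys);
print_r($descending);
```

If you need the screen names after sorting, asort() or arsort() is
probably what you want; sort() is fine when only the values matter.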


> PDO will use the mysql-nd driver if you have it installed, which
> allows for persistent connections. That might offer a speed boost as well.

Yes, this might be true (I have seen stats both ways). However, the
majority of our scripts terminate at or before 60 seconds (because
of the refresh rate of the data mine API), so persistent connections
are not worth it yet. I will review more comparisons and reply with my
results.
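
For anyone who wants to try it, this is roughly how a persistent
connection is requested through PDO. It is only a sketch: the helper
names pdoOptions() and connect() are hypothetical, and the DSN and
credentials in the example call are placeholders, not our setup.

```php
<?php
// Build the PDO driver options. PDO::ATTR_PERSISTENT asks PDO to reuse
// a cached connection instead of opening a new one on every request.
function pdoOptions(bool $persistent): array
{
    return [
        PDO::ATTR_PERSISTENT => $persistent,
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
    ];
}

// Hypothetical helper: the caller supplies the DSN, user and password.
function connect(string $dsn, string $user, string $pass, bool $persistent = false): PDO
{
    return new PDO($dsn, $user, $pass, pdoOptions($persistent));
}

// Example call (not executed here; DSN and credentials are made up):
// $db = connect('mysql:host=localhost;dbname=tweets;charset=utf8mb4', 'user', 'secret', true);
```

Keeping the flag as a parameter makes it easy to benchmark the same
script with and without persistence.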

> I'm not sure PHP strtolower is safe for multi-byte UTF-8. My PHP is
> very rusty but I think the (slower) mb_strtolower is multi-byte character
> safe. I don't think this affects the code you're using, but it's always
> worth mentioning.

Very good point, Matt. Yes, mb_strtolower() (that is the correct
function) would be better. However, I would like to point out that it
can be almost 50% slower in most cases. That would not work for our
setup, as we need to process tweets in less than 0.1 seconds. Also, I
understand that Twitter would like to be as diverse as possible,
but I have not set up any of my regular expressions for multi-byte
characters, so by running strtolower() I'm actually helping out the
rest of the script.
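
Here is a small sketch of the difference, in case it helps. The sample
string and loop count are made up, it assumes the mbstring extension
is loaded, and the timing will vary by machine, so measure it on your
own setup rather than trusting my numbers.

```php
<?php
// Hypothetical tweet text containing multi-byte UTF-8 characters.
$tweet = 'ÉCOLE Ünïcode TEST';

// strtolower() works byte by byte. ASCII letters are lowered; on PHP 8
// (or in the default "C" locale) multi-byte characters are left as-is.
$plain = strtolower($tweet);

// mb_strtolower() understands the encoding and lowers É and Ü as well.
$multibyte = mb_strtolower($tweet, 'UTF-8');

// Rough timing comparison over an arbitrary number of iterations.
$iterations = 10000;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    strtolower($tweet);
}
$plainSeconds = microtime(true) - $start;

$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    mb_strtolower($tweet, 'UTF-8');
}
$multibyteSeconds = microtime(true) - $start;

echo $plain, "\n", $multibyte, "\n";
printf("strtolower: %.4fs, mb_strtolower: %.4fs\n", $plainSeconds, $multibyteSeconds);
```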

Thanks for all the input, keep it coming!
