> This actually is a slowdown. You should store each tweet as a row in
> the table and not 10 tweets as 1 serialized row. You not only can't
> search well, but updates and retrieval will be hindered.

Yes, that is definitely true. However, in certain circumstances (such as data that you will never need to search through) it can be very useful for saving space and rows. We store most of the extra data (timezone, etc.) in arrays because I do not ever envision looking up timezones. The only problem with arrays is that they are not easy to sort. You can use sort(), asort(), or similar functions, but at first they are difficult to master.

> PDO will use the mysql-nd driver if you have it installed, which
> allows for persistent connections. That might offer a speed boost as well.

Yes, this might be true (I have seen stats both ways). However, the majority of our scripts terminate at or before 60 secs (because of the refresh rate of the data mine API), so having persistent connections is not worth it yet. I will review more comparisons and reply with my results.

> I'm not sure PHP strtolower is safe for multi-byte UTF-8. My PHP is
> very rusty but I think the (slower) mb_strtolower is multi-byte character
> safe. I don't think this affects the code you're using, but it's always
> worth mentioning.

Very good point Matt. Yes, mb_strtolower() (that is the correct function) would be better; however, I would like to point out that it can be almost 50% slower in most cases. This would not work for our setup, as we need to process tweets in less than .1 secs. Also, I understand that Twitter would like to be as diverse as possible, however I have not set up any of my regular expressions for non-UTF-8 chars, so by running this I'm actually helping out the rest of the script.

Thanks for all the input, keep it coming!
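
P.S. For anyone who wants to check the strtolower() vs. mb_strtolower() difference on their own setup, here is a minimal sketch (not our production code). It assumes the mbstring extension is loaded and the file is saved as UTF-8; the helper name time_calls() and the sample string are made up for illustration.

```php
<?php
// Sketch only: compare byte-oriented strtolower() with multi-byte-aware
// mb_strtolower() on a UTF-8 string. Requires the mbstring extension.

// Hypothetical helper (not from our scripts): seconds taken by $n calls of $fn.
function time_calls(callable $fn, string $input, int $n): float
{
    $start = microtime(true);
    for ($i = 0; $i < $n; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// Made-up sample text containing non-ASCII letters.
$tweet = "HELLO ÉCOLE ÜBER";

// strtolower() reliably lowercases the ASCII letters; what happens to the
// bytes of É and Ü depends on the PHP version and the current locale.
$ascii = strtolower($tweet);

// mb_strtolower() with an explicit encoding lowercases the accented letters too.
$multi = mb_strtolower($tweet, 'UTF-8');

echo $ascii, "\n", $multi, "\n";

// Rough timing only; the ratio varies by PHP version, input, and machine.
$n = 100000;
$mb = function ($s) {
    return mb_strtolower($s, 'UTF-8');
};
printf("strtolower:    %.4f s\n", time_calls('strtolower', $tweet, $n));
printf("mb_strtolower: %.4f s\n", time_calls($mb, $tweet, $n));
```

The timing lines are only a rough guide, so run them against real tweet text before deciding which function fits a time budget.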
