Ken Egervari wrote:
I thought about this, but it's tricky since shipment and shipment_status are both updated in real time, 24/7/365. I might be able to cache it within the application for 60 seconds at most, but that would make little difference since people tend to refresh within that window anyway. It's very important that real-time information is available, though.
Is 60s real-time enough for you? That's what it sounds like. It would be nice if you could have 1hr, but there's still a lot of extra work you can do in 60s.
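To make the 60-second idea concrete, here is a minimal sketch (not from the thread; class and key names are illustrative) of an application-side cache that serves a stored result until it is 60 seconds old, and only then re-runs the real query:

```python
import time

class TTLCache:
    """Minimal time-based cache: entries expire after ttl seconds."""
    def __init__(self, ttl=60):
        self.ttl = ttl
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key, compute):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: skip the database
        value = compute()            # stale or missing: run the real query
        self._store[key] = (now + self.ttl, value)
        return value

# Hypothetical usage: fetch stands in for the expensive shipment query.
cache = TTLCache(ttl=60)
calls = []
fetch = lambda: calls.append(1) or "shipment rows"
cache.get("user:42", fetch)
cache.get("user:42", fetch)   # within 60s: served from cache, fetch runs once
```

With a 60-second refresh and a 60-second TTL, each user costs at most one query per interval no matter how often they hit refresh.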
You could also always throw more hardware at it. :) If the shipment_status is one of the bottlenecks, create a 4-disk raid10 and move the table over. I don't remember what your hardware is, but I don't remember it being a quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction log on a solid state disk. :)
That sounds like an awesome system. I'd love to have something like that. Unfortunately, the production server is just a single-processor machine with 1 GB of RAM. I think throwing more disks at it is probably the best bet, moving the shipment and shipment_status tables over as you suggested. That's great advice.
Well, disk I/O is one side of it, but adding another 1GB (2GB total) would also be a fairly economical performance upgrade.
You are looking for query performance, not really update performance, right? So buy a 4-port SATA controller and some WD Raptor 10k SATA disks. With those you can build a RAID10 for under $2k (probably closer to $1k).
30ms is a good target, although perhaps I was naive in setting that goal. I've taken queries that ran at 600ms and, with 1 or 2 indexes, got them down to 15ms.
It all depends on your query. If you have a giant table (1M rows) and you are doing a seqscan to find only 5 rows, then adding an index will give you enormous gains. But you are fetching 30k rows and combining them with 6k rows, plus a bunch of other stuff. I think we've tuned the query about as far as we can.
Let's say we have 200 users signed into the application at the same time. The application refreshes their shipment information automatically to make sure it's up to date on the user's screen, so it will execute the query we are trying to tune every 60 seconds for most of these users. Users can set the refresh time higher, but 60 is the minimum, so I'm assuming everyone has it at 60.
Anyway, if you have 200 users logged in, that's 200 queries per 60-second period, or about 3-4 queries every second on average. As you can see, it's getting maxed out, and through bad luck the queries bunch together and get called at the same time, producing 8-9 queries in the same second, which is where performance starts to degrade. I just know that if I could get this down to 30 ms, or even 100, we'd be okay for a few months without throwing hardware at the problem. Also keep in mind that other application logic and Hibernate mapping is occurring too, so 3-4 queries a second is already no good when everything is running on a single machine.
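The load arithmetic above can be checked directly (the burst figure of 8-9/sec is from the thread; the average is just users divided by the refresh interval):

```python
users = 200
refresh_interval_s = 60

# Average rate if every user refreshes exactly once per interval.
avg_qps = users / refresh_interval_s   # about 3.33 queries/second

# At 200ms per query, how many concurrent queries does that average imply?
query_time_s = 0.2
avg_in_flight = avg_qps * query_time_s  # about 0.67 queries in flight
```

The 200ms figure is an assumption for illustration; the point is that bursts (8-9 queries landing in the same second) matter far more than the comfortable-looking average.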
The other query I just sent, where you do the query for all users at once and then cache the result, *might* be cheaper than doing a bunch of separate queries. However, you may find that doing the query for *all* users takes too long. So you could keep another table indicating who the most recently logged-in people are, and then only cache the info for those people. This does start getting a little more involved, so see if you can do all users before heading down this road.
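The all-users variant of the idea might look like this sketch (names and the fetch_all shape are hypothetical, not from the thread): one expensive query produces a per-user snapshot, and every user's refresh is served from that snapshot until it expires.

```python
import time

class BulkShipmentCache:
    """Run the expensive query once for everyone, then serve per-user
    slices from memory until the snapshot expires."""
    def __init__(self, fetch_all, ttl=60):
        self.fetch_all = fetch_all   # assumed to return {user_id: rows}
        self.ttl = ttl
        self._snapshot = {}
        self._expires = 0.0

    def rows_for(self, user_id):
        now = time.monotonic()
        if now >= self._expires:
            self._snapshot = self.fetch_all()   # one query instead of N
            self._expires = now + self.ttl
        return self._snapshot.get(user_id, [])

# Hypothetical usage: fetch_all stands in for the bulk query.
calls = []
def fetch_all():
    calls.append(1)
    return {1: ["row-a"], 2: ["row-b"]}

cache = BulkShipmentCache(fetch_all, ttl=60)
cache.rows_for(1)
cache.rows_for(2)   # same snapshot; fetch_all ran only once
```

Restricting fetch_all to recently logged-in users, as suggested above, would bound the cost of each snapshot refresh.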
This isn't the best setup, but it's the best we can afford. We are just a new startup company. Cheaper servers and open source keep our costs low. But money is starting to come in after 10 months of hard work, so we'll be able to replace our server within the next 2 months. It'll be a necessity because we are signing on some big clients now and they'll have 40 or 50 users for a single company. If they are all logged in at the same time, that's a lot of queries.
Sure. Just realize you can't really support 200 concurrent connections with a single P4 and 1GB of ram.