jos houtman wrote:
First of all, thank you for the many replies and interesting discussions.

Let me tell you what we concluded: after some tests it was obvious that 10k files per directory is far better than the 50k we use now. The longest rsync on 10k files took 15 seconds, and the average was about 8 seconds. That is with the 9TB system using JFS and the 4TB system using ReiserFS.
We intend to use rsync in combination with marking the folders dirty.
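As a rough illustration of that approach, here is a minimal Python sketch of "rsync only the directories marked dirty". The marker file name, the paths, and the rsync flags are my own assumptions for the example, since the post doesn't say how the dirty flag is actually recorded; adapt it to whatever mechanism you end up using.

#!/usr/bin/env python3
"""Sync only directories flagged as dirty, one small rsync per directory.

Assumes each changed directory contains a marker file named ".dirty"
(hypothetical -- substitute whatever your upload code actually writes).
"""
import os
import subprocess

SRC_ROOT = "/data/images"          # assumed local root with 10k-file dirs
DEST = "backup1:/data/images"      # assumed rsync destination
MARKER = ".dirty"                  # assumed dirty-flag file name

def sync_dirty(src_root, dest):
    for name in sorted(os.listdir(src_root)):
        src_dir = os.path.join(src_root, name)
        if not os.path.isdir(src_dir):
            continue
        marker = os.path.join(src_dir, MARKER)
        if not os.path.exists(marker):
            continue                       # untouched directory, skip it
        # One short rsync per dirty directory keeps each run in the
        # seconds range instead of walking the whole tree.
        result = subprocess.run(["rsync", "-a", "--delete",
                                 "--exclude", MARKER,
                                 src_dir + "/", "%s/%s/" % (dest, name)])
        if result.returncode == 0:
            os.remove(marker)              # clear the flag only on success

if __name__ == "__main__":
    sync_dirty(SRC_ROOT, DEST)

Run it from cron as often as your freshness requirements demand; directories that were never marked dirty cost nothing but a stat.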

This method should scale well enough for us; figures indicate we might have 100TB by the end of the year.

I believe there is no ultimate solution for a company like ours. We are constantly trying to find better solutions: some new website feature requires a different hardware setup to be optimal, and bottlenecks are common. Therefore what would be the ultimate solution now might not be so in a few weeks. We will keep looking for better solutions for storage, backup, and all other areas, but we will have to do it as problems arise; resources are spread a little thin.
The just-in-time concept has penetrated system management.

There are a few things you can try to make what you've got faster, or at least get into your plan for the future.

1. Smaller drives have better seek times. Basically the whole more-spindles-per-unit-of-data thing. I dealt with a very large mail system in '01, and the change from 36GB drives to 72GB drives cut I/O throughput enough that we had to swap back to the smaller drives. 500GB SATA drives look great on paper, but 300GB drives might perform better.

2. Cache more at your web layer and keep I/O off your storage. Run all webservers with the most RAM you can afford; if a file is in the local cache, it's not a storage hit. Put a reverse-proxying Squid in front of your webservers, or a purpose-built Squid proxy with fast local disk serving files directly, or a dedicated cache layer you redirect to, or a media cluster that doesn't carry the overhead of running PHP, Perl, or whatever on the main site. Lots of interesting options here, and they all just make the site faster.

3. If 5% of your content is 90% of your bandwidth, then a content delivery system makes sense. However, uploading a data set in the TB range is not cost effective.

4. Smaller disk groups on your storage. An EMC engineer explained this one to me. Say you've got sixteen drives in your array. Rather than one big RAID 5 set, you make three RAID 5 sets with a floating hot spare. Each set holds its own data, so when you look for fileA you hit drives 1-5 rather than all fifteen. The smaller data set per group means random requests are spread less violently across the array, each drive is more likely to get a cache hit since it isn't supporting the whole data set, and so on.

5. Rumor is that iSCSI is faster and has less overhead than NFS, so you might want to test both. Also, don't believe any of the nonsense about needing TOE cards or dedicated HBAs for either; just be able to dedicate an Ethernet interface to storage.

6. Jumbo frames. Assuming part of your problem is NFS data ops, switching to jumbo frames raises the packet size from 1500 bytes to 9000 bytes and cuts your data ops. I just about doubled throughput by using jumbo packets with iSCSI backing a video streaming service. However, this only works if you have a dedicated storage LAN and set all servers, clients, and switch ports to use jumbo frames (MTU 9000). Using jumbo frames on the "going out to the Internet" side is usually problematic, and some switches don't support them at all. (A quick end-to-end check is sketched below, after the list.)

7. Graph the hell out of everything. MRTG, Cacti, Excel, whatever. I cannot stress this one enough; it has saved my ass a number of times over the past ten years. Having graphs of load, RAM usage, storage, local I/O, and network I/O; MySQL queries, scans, table locks, cache hits, full table scans, etc.; NFS data ops, Apache processes, and so on makes troubleshooting a million times easier. And it's great for getting more money out of management when you can prove the storage is doing twice the work it was doing three months ago. (A minimal collector sketch follows.)
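On the graphing point (item 7): a minimal sketch of a collector you could run from cron every five minutes and graph later. The /proc paths are Linux-specific, /proc/net/rpc/nfs only exists on an NFS client (the server side is /proc/net/rpc/nfsd), and the output path is made up, so treat this as an assumption-laden stand-in for a proper MRTG/Cacti setup rather than a replacement for one.

#!/usr/bin/env python3
"""Append a timestamp, the 1-minute load average, and the cumulative NFS
client RPC call counter to a CSV so the numbers can be graphed later."""
import csv
import time

OUTFILE = "/var/log/storage-metrics.csv"     # assumed output path

def read_loadavg():
    with open("/proc/loadavg") as f:
        return float(f.read().split()[0])    # 1-minute load average

def read_nfs_rpc_calls():
    # /proc/net/rpc/nfs has a line: "rpc <calls> <retrans> <authrefresh>"
    with open("/proc/net/rpc/nfs") as f:
        for line in f:
            if line.startswith("rpc "):
                return int(line.split()[1])
    return 0

if __name__ == "__main__":
    row = [int(time.time()), read_loadavg(), read_nfs_rpc_calls()]
    with open(OUTFILE, "a", newline="") as f:
        csv.writer(f).writerow(row)

Note that the RPC number is a cumulative counter, so graph the delta between samples, which is exactly what MRTG and Cacti do for you anyway.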
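And on the jumbo frames point (item 6): before trusting MTU 9000 end to end, it's worth confirming that large packets actually cross the storage path with fragmentation forbidden. A rough sketch, assuming Linux iputils ping and a hypothetical storage host name:

#!/usr/bin/env python3
"""Check that jumbo frames survive the storage path end to end by sending
non-fragmentable pings sized for MTU 9000 (9000 minus 28 bytes of
IP + ICMP headers). Assumes Linux iputils ping; the host name is made up."""
import subprocess

STORAGE_HOST = "nfs1.storage.lan"            # hypothetical storage-LAN host

def jumbo_ok(host, mtu=9000):
    payload = mtu - 28                       # IP (20) + ICMP (8) headers
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(payload), host],
        capture_output=True)
    return result.returncode == 0

if __name__ == "__main__":
    if jumbo_ok(STORAGE_HOST):
        print("jumbo frames OK on the storage path")
    else:
        print("the path will not carry MTU 9000 without fragmentation")

If that fails while a plain ping works, some server, client, or switch port in between is still at MTU 1500.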

kashani