Hi All,

Thanks for the responses, and I do concur. I was taking a stab in the
dark, so to speak.
We are currently working with our hosting providers and will be
introducing a number of small iSCSI SANs to spread the storage
structure over a multitude of disks... This is something that needs to
be addressed from a systems perspective rather than an architectural
one. SSDs (Fusion-io and the like) are unfortunately still way too
expensive for the capacity that we require (a good couple of TBs), so
it will have to be mechanical disks. However, with the use of SANs as
we hope, we should be able to go from 4 to over 64 spindles whilst
still being able to share the storage and have redundancy.

Many thanks for the inputs and feedback...

--
C

On Fri, Jul 26, 2013 at 9:23 AM, Johan De Meersman <vegiv...@tuxera.be> wrote:
> Hey Chris,
>
> I'm afraid that this is not what databases are for, and the first
> thing you'll likely run into is the number of concurrent connections.
>
> This is typically something you should really tackle from a systems
> perspective. Seek times are dramatically improved on SSD or similar
> storage; think FusionIO cards, but there are also a couple of vendors
> (Violin comes to mind) who provide full-blown SSD SANs.
>
> If you prefer staying with spinning disks, you could still improve
> the seeks by focusing on the inner cylinders and potentially by using
> variable sector formatting. Again, there are SANs that do this for
> you.
>
> Another minor trick is to turn off access-timestamp updates when you
> mount the filesystem (noatime).
>
> Also benchmark different filesystems; there are major differences
> between them. I've heard XFS being recommended, but I've never needed
> to benchmark for seek times myself. We're using IBM's commercial GPFS
> here, which is good with enormous amounts of huge files (media farm
> here); I'm not sure how it'd fare with smaller files.
>
> Hope that helps,
> Johan
>
> ----- Original Message -----
>> From: "Chris Knipe" <sav...@savage.za.org>
>> To: mysql@lists.mysql.com
>> Sent: Thursday, 25 July, 2013 11:53:53 PM
>> Subject: hypothetical question about data storage
>>
>> Hi all,
>>
>> We run a VERY IO-intensive file application service. Currently, our
>> problem is that our disk spindles are being completely killed by
>> SEEK times on the hard drives (NOT physical read/write speeds).
>>
>> We have a directory structure where the files are stored based on
>> the MD5 checksum of the file name, e.g.
>> /0/00/000/000044533779fce5cf3497f87de1d060
>> The majority of these files are between 256K and 800K, with the ODD
>> exception (say, less than 15%) being more than 1M but no more than
>> 5M in size. The content of the files is pure text (MIME-encoded).
>>
>> We believe that storing these files in an InnoDB table may actually
>> give us better performance:
>> - There is one large file being read/written, instead of BILLIONS
>>   of small files
>> - We can split the structure so that each directory (4096 in total)
>>   sits in its own database
>> - We can move the databases as load increases, which means that we
>>   could potentially run 2 physical database servers, each with 2048
>>   databases
>> - It's easy to move/migrate the data thanks to MySQL replication;
>>   the same can be said for redundancy of the data
>>
>> We are more than likely looking at BLOB columns of course, and we
>> need to read/write from the DB in excess of 100Mbit/s.
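For illustration, here is a minimal sketch of the kind of table the
proposal above implies; the table and column names are hypothetical,
not taken from the thread. One sizing note: a plain BLOB column holds
at most 64K, so files of 256K to 5M would need at least a MEDIUMBLOB
(16M cap).

    -- One of the 4096 per-directory schemas: the three-character hex
    -- prefix gives 16^3 = 4096 buckets, matching the directory count.
    CREATE TABLE file_store (
        name_md5 CHAR(32)   NOT NULL PRIMARY KEY, -- hex MD5 of the file name
        body     MEDIUMBLOB NOT NULL              -- MIME-encoded file content
    ) ENGINE=InnoDB;

    -- The existing on-disk path is derivable from the same hash, e.g.
    -- /0/00/000/000044533779fce5cf3497f87de1d060:
    SELECT CONCAT('/', LEFT(name_md5, 1),
                  '/', LEFT(name_md5, 2),
                  '/', LEFT(name_md5, 3),
                  '/', name_md5) AS disk_path
    FROM file_store;

    -- Reads and writes then become single-row primary-key lookups:
    SELECT body FROM file_store
    WHERE name_md5 = '000044533779fce5cf3497f87de1d060';

Whatever shape the table takes, max_allowed_packet would have to be
raised above the largest stored file for BLOB transfers of this size
to work.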
>> Would the experts consider something like this as being feasible?
>> Is it worth going down this avenue, or are we just going to run
>> into different problems? If we are facing different problems, what
>> can we possibly expect to go wrong here?
>>
>> Many thanks, and I look forward to any input.
>>
>
> --
> Unhappiness is discouraged and will be corrected with kitten pictures.

--
Regards,
Chris Knipe

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql