When using the IBM sizing explained here: http://publib.boulder.ibm.com/infocenter/tsminfo/v6r2/topic/com.ibm.itsm.srv.install.doc/t_srv_inst_db_space.html you will run into big problems when using dedupe on disk with mostly virtual machine backups. You will get amazing dedupe ratios, but sizing the DB almost entirely on file count (as described by IBM) will fail in an EPIC way.
I think this is a real issue, because the estimate is not just a little off: it can be (and was) the difference between a 10 GB and a 1 TB+ TSM database. That's something people would like to know in advance. The "Overhead can require up to 50% in additional space." remark should read something like "Overhead can require up to 100x your sizing result in additional space"; that's keeping it real. When I opened a call with IBM support, I got a reply that doesn't help me much: they told me I should not back up more than 10 or 12 virtual machines on a single TSM server that uses dedupe on the VM data. Well, that's nice. So now I can dedupe the data and store more files on a TSM 6.2 instance, but I still need more TSM instances because I can only back up 10-12 VMs on one instance due to metadata? That can't be right, I am sure it's not right, but the question remains: what is right? What formula can I use to determine the size of the TSM database, one that also accounts for dedupe on a few very large files that I might want to keep for a few months? This issue just cost me a TSM client, so I am a little upset right now.
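
To illustrate the kind of formula I am after, here is a rough sketch in Python. It is not an IBM formula. It assumes the database holds one metadata record per file plus one record per dedupe chunk, and that chunk records scale with the total bytes backed up across all retained versions. The function name and every parameter (`bytes_per_file`, `avg_chunk_bytes`, `bytes_per_chunk`) are hypothetical placeholders that you would have to measure on your own server.

```python
def estimate_db_bytes(file_count, logical_bytes, bytes_per_file,
                      avg_chunk_bytes, bytes_per_chunk):
    """Split a rough DB size estimate into a file part and a chunk part.

    All per-record costs are assumptions supplied by the caller,
    not values documented by IBM.
    """
    # Classic file-count sizing: one metadata record per stored file.
    file_part = file_count * bytes_per_file

    # Assumed dedupe overhead: one record per chunk (ceiling division).
    chunk_count = -(-logical_bytes // avg_chunk_bytes)
    chunk_part = chunk_count * bytes_per_chunk

    return file_part, chunk_part


# Example call with made-up placeholder numbers: a handful of very
# large VM image files, kept in many versions.
file_part, chunk_part = estimate_db_bytes(
    file_count=50,
    logical_bytes=20 * 1024**4,   # 20 TiB backed up in total (assumed)
    bytes_per_file=1000,          # placeholder
    avg_chunk_bytes=64 * 1024,    # placeholder
    bytes_per_chunk=500,          # placeholder
)
print("file part :", file_part)
print("chunk part:", chunk_part)
```

With few, very large files the file part stays tiny no matter what, while the chunk part grows with the amount of data, which would explain why a file-count-only estimate can be off by orders of magnitude.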
