Dear All,
I would appreciate your views on the maximum practical size of an HBase cluster.

We are currently designing an HBase cluster. One of its tables (designed in wide format) is expected to hold roughly 6 billion rows in production after 3 years, with an additional 200 million rows added each month. We also expect roughly 250 columns per row. The expected data volume for this table is around 250 TB at the end of 3 years (before HDFS replication), growing by 7 TB per month.

We are provisioning the number of nodes based on the expected data volume, but we wanted to check whether there is any limit on the number of rows per cluster. Would it be advisable in this situation to split the cluster into two or more independent clusters? Will read/write throughput or latency be affected as the table grows over time?

Please advise.

Regards,
Sreeram
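
P.S. For anyone who wants to sanity-check the figures above, here is a rough back-of-the-envelope sketch in Python. It only restates the numbers from this post. The replication factor of 3 and the 10 GB maximum region size are assumptions (commonly cited defaults, not values taken from our configuration), TB is taken as 10^12 bytes, and the function name `sizing` is purely illustrative.

```python
TB = 10**12  # assumption: decimal terabytes (the post does not say TB vs TiB)
GB = 10**9


def sizing(rows, raw_bytes, columns_per_row, replication=3, region_bytes=10 * GB):
    """Rough estimates derived from a row count and a raw data volume.

    replication and region_bytes are assumed values, not measured ones.
    """
    bytes_per_row = raw_bytes / rows
    return {
        "bytes_per_row": bytes_per_row,
        "bytes_per_cell": bytes_per_row / columns_per_row,
        # Raw volume multiplied by the assumed HDFS replication factor.
        "replicated_bytes": raw_bytes * replication,
        # Ceiling division: regions needed if every region were completely full.
        "regions": -(-raw_bytes // region_bytes),
    }


# Figures from the post: the 3-year projection and the monthly increment.
three_year = sizing(rows=6_000_000_000, raw_bytes=250 * TB, columns_per_row=250)
monthly = sizing(rows=200_000_000, raw_bytes=7 * TB, columns_per_row=250)

for label, est in (("3-year total", three_year), ("monthly growth", monthly)):
    print(label)
    print(f"  avg row size     : {est['bytes_per_row'] / 1000:.1f} KB")
    print(f"  avg cell size    : {est['bytes_per_cell']:.0f} bytes")
    print(f"  with replication : {est['replicated_bytes'] / TB:.0f} TB")
    print(f"  full regions     : {est['regions']}")
```

The region count is a lower bound under these assumptions, since regions are rarely completely full in practice.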
