Can you share some more detail on the requirements?

- What is the analytics use case (batch processing, real-time, in-memory requirements)?
- Which distribution of Hadoop?
- What is the storage growth rate?
- What are the data ingest requirements?
- What kinds of jobs will run on the cluster?
- What is the nature of the data? Is data compression applicable?
- What are the HA requirements?
- What are the performance expectations?
Based on these requirements, you would have to design the compute, storage, and network elements.

Thanks and Regards,
Ashish Kumar
IBM Systems BigData Analytics Solutions Architect

From: Bhagaban Khatai <[email protected]>
To: [email protected]
Date: 05/29/2015 11:32 AM
Subject: Cluster sizing

Hi,

I wanted to know how I can determine how many nodes (with cores, storage in TB, and RAM) are needed if the data volume I receive increases from 1 TB to 100 TB per day. Can someone help me create an Excel sheet based on this?

Thanks
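
As a rough illustration of the arithmetic such a spreadsheet would capture, here is a minimal Python sketch. All of the input figures (365-day retention, HDFS replication factor 3, 25% temp/scratch space, a 48 TB / 24-core / 256 GB node profile) are assumptions for illustration only and would need to be replaced with the answers to the requirement questions above.

import math

# Back-of-the-envelope Hadoop cluster sizing sketch.
# Every default below is an assumption; substitute real requirements.
def size_cluster(daily_ingest_tb,
                 retention_days=365,        # how long data is kept on the cluster
                 replication_factor=3,      # HDFS default replication
                 compression_ratio=1.0,     # 1.0 = no compression assumed
                 temp_space_fraction=0.25,  # scratch space for MapReduce/Spark
                 node_raw_storage_tb=48,    # e.g. 12 x 4 TB drives per data node
                 node_cores=24,
                 node_ram_gb=256):
    """Return a rough node count and aggregate resources for a given ingest rate."""
    # Data footprint after compression, then replication and temp-space overhead.
    stored_tb = daily_ingest_tb * retention_days * compression_ratio
    raw_tb_needed = stored_tb * replication_factor * (1 + temp_space_fraction)

    # Nodes are sized purely on storage here; cores and RAM follow from the node spec.
    nodes = math.ceil(raw_tb_needed / node_raw_storage_tb)
    return {
        "raw_storage_tb": round(raw_tb_needed, 1),
        "data_nodes": nodes,
        "total_cores": nodes * node_cores,
        "total_ram_gb": nodes * node_ram_gb,
    }

# Example: today's ingest (1 TB/day) versus the projected 100 TB/day.
for ingest in (1, 100):
    print(ingest, "TB/day ->", size_cluster(ingest))

Running it for 1 TB/day versus 100 TB/day shows how the node count scales roughly linearly with ingest once retention, replication, and the per-node hardware profile are fixed; CPU- or memory-bound workloads would need a separate check against the job mix identified above.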
