On 21/10/2018 09:25, Amine Tengilimoglu wrote:
> Hi all;
>
> I want to learn how I can estimate the hardware needed for a Hadoop
> cluster. Is there any standard or other guidance?
>
> For example, I have 10TB of data, and I will analyze it... My replication
> factor will be 2.
>
> How much RAM do I need for one node? How can I estimate it?
> How much disk do I need for one node? How can I estimate it?
> How many CPU cores do I need for one node?
>
> Thanks in advance..
>
Hi, there are some docs on the HDP documentation website; the relevant HDP document is called "Cluster Planning". I think Cloudera and other vendors should have something similar. There are some rules of thumb, but if you want to be precise, you need to know which services you will run, how many resources each of them will need, and the performance you expect. A good estimate requires someone with a solid understanding of the services running in the cluster, plus the right input about what you are going to do with it.
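
To give a feel for the disk side only, here is a minimal Python sketch of the simplest arithmetic: logical data size times replication factor, plus some headroom. The 10 TB and replication factor 2 come from your message; the 25% headroom for temporary and intermediate data and the 8 TB of usable disk per node are purely illustrative assumptions of mine, not figures from any vendor guide. RAM and CPU depend on the services and workload, so they are not covered here.

```python
import math


def estimate_raw_storage_tb(data_tb, replication, headroom_fraction=0.25):
    """Raw disk capacity (TB) needed to hold data_tb with the given replication.

    headroom_fraction is the share of raw capacity kept free for temporary
    and intermediate data; 0.25 is an assumed placeholder, not a standard.
    """
    if not 0 <= headroom_fraction < 1:
        raise ValueError("headroom_fraction must be in [0, 1)")
    replicated_tb = data_tb * replication
    return replicated_tb / (1 - headroom_fraction)


def estimate_node_count(raw_tb, usable_disk_per_node_tb):
    """Number of data nodes needed, rounded up to a whole node."""
    return math.ceil(raw_tb / usable_disk_per_node_tb)


# Figures from the question: 10 TB of data, replication factor 2.
raw_tb = estimate_raw_storage_tb(data_tb=10, replication=2)

# Hypothetical node size: 8 TB of usable disk per node.
nodes = estimate_node_count(raw_tb, usable_disk_per_node_tb=8)

print(f"raw capacity needed: {raw_tb:.1f} TB, data nodes: {nodes}")
```

This ignores data growth, compression, and OS or log space, so treat the result as a lower bound to refine once the services and workload are known.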