Hi all, I have to estimate the resource requirements for my Hadoop/Spark cluster. In particular, I have to query about 100 TB of HBase table data and run aggregations with Spark SQL.
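To be concrete, the workload looks roughly like this (a minimal sketch, assuming the Apache hbase-spark connector is on the classpath; the `events` table and its columns are placeholders for my real schema):

```scala
import org.apache.spark.sql.SparkSession

object HBaseAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HBaseAggregation")
      .getOrCreate()

    // Read the HBase table through the hbase-spark connector.
    // The table name and column mapping below are placeholders.
    val df = spark.read
      .format("org.apache.hadoop.hbase.spark")
      .option("hbase.table", "events")
      .option("hbase.columns.mapping",
        "rowkey STRING :key, category STRING cf:category, amount DOUBLE cf:amount")
      .option("hbase.spark.use.hbasecontext", "false")
      .load()

    df.createOrReplaceTempView("events")

    // The kind of aggregation I need to run over the full table.
    val result = spark.sql(
      """SELECT category, COUNT(*) AS cnt, SUM(amount) AS total
        |FROM events
        |GROUP BY category""".stripMargin)

    result.show()
    spark.stop()
  }
}
```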
What is, approximately, the most suitable cluster configuration for this use case, so that the data can be queried quickly? Eventually I have to develop an online analytical application on top of these data. I would like to know what kind of nodes I should configure to achieve this goal: how much RAM, and how many cores and disks should each node have?

Thanks in advance, best regards,
Pietro