Hello everyone,

I am new to Hadoop and would like to see if I'm on the right track.
Currently I'm developing an application which would ingest logs of order of 
60-70 GB of data/day and would then do
Some analysis on them
Now the infrastructure that I have is a 4 node cluster( all nodes on Virtual 
Machines) , all nodes have 4GB ram.

But when I try to run the dataset (which is a sample dataset at this point ) of 
about 30 GB, it takes about 3 hrs to process all of it.

I would like to know is it normal for this kind of infrastructure to take this 
amount of time.


Thank you

Nikhil Kandoi/

Reply via email to