Thanks for your reply. I am new to Hadoop and Hive. My goal is to process a large dataset using Hadoop for my university project (Data Mining). I need to show that Hadoop is better than MySQL for big data processing (30-100 GB+); I know Hadoop does that. To do so, could you please suggest how many nodes are required to show the performance difference, and what configuration is required for each node?
From: [email protected]
To: [email protected]
CC: [email protected]
Date: Tue, 12 Mar 2013 10:40:33 +0100
Subject: RE: Getting Slow Query Performance!

Generally a single Hadoop machine will perform worse than a single MySQL machine. People normally use Hadoop when they have so much data that it won't really fit on a single machine and would otherwise require specialized hardware (stuff like SANs) to run.

30GB of data really isn't that much, and 2GB of RAM is really not what Hadoop is designed to work on. It really likes to have lots of memory. I also don't see the Hadoop configuration files, so perhaps you only have 1 mapper and 1 reducer. But this is not a typical use case, so I doubt you'll see snappy performance even after tweaking the configs.
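
To make the mapper point concrete, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: one map task per input split, a split size equal to a 64 MB HDFS block, and splittable input. The function names and the node/slot counts are hypothetical, chosen for illustration, and are not taken from the thread.

```python
import math

GB = 1024 ** 3
MB = 1024 ** 2


def estimate_map_tasks(input_bytes, block_bytes=64 * MB):
    """Approximate map task count, assuming one task per block-sized split."""
    if input_bytes <= 0:
        return 0
    return math.ceil(input_bytes / block_bytes)


def estimate_map_waves(map_tasks, nodes, map_slots_per_node):
    """Number of sequential 'waves' needed to run all map tasks,
    given how many map tasks the cluster can run at the same time."""
    concurrent_slots = nodes * map_slots_per_node
    if concurrent_slots <= 0:
        raise ValueError("cluster must have at least one map slot")
    return math.ceil(map_tasks / concurrent_slots)


if __name__ == "__main__":
    tasks = estimate_map_tasks(30 * GB)
    print("map tasks for 30 GB:", tasks)
    # Hypothetical cluster shapes: (nodes, map slots per node)
    for nodes, slots in [(1, 1), (4, 4)]:
        waves = estimate_map_waves(tasks, nodes, slots)
        print(f"{nodes} node(s) x {slots} slot(s): {waves} wave(s)")
```

With a single mapper every split is processed one after another, which is why a one-node setup tends to look slow next to MySQL; more nodes and more slots per node reduce the number of waves, provided each node has enough memory for its concurrent tasks.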
