Thanks, Saurabh.

On Mon, Aug 20, 2012 at 12:15 PM, Saurabh bhutyani <[email protected]> wrote:
> Dear Mahsa,
>
> You need to increase the data size to benefit from Hadoop. Hadoop creates
> input splits based on the configured split size, which defaults to 64 MB.
> If an input file is smaller than 64 MB, it yields a single split, so the
> job runs with only one map task for that file.
>
> Thanks & Regards,
> Saurabh Bhutyani
>
> Call : 9820083104
> Gtalk: [email protected]
>
> On Mon, Aug 20, 2012 at 6:33 PM, Mahsa Mofidpoor <[email protected]> wrote:
>
>> Hello,
>>
>> I ran a simple join (select col_list from table1 join table2 on
>> (join_condition)) on both single-node and multi-node setups. The table
>> sizes are 1.7 MB and 4.2 MB respectively. It takes more time to execute
>> the query on the cluster than on a single-node Hadoop setup.
>> I checked the map logs and saw that both map tasks ran on the master
>> node.
>> Do I need to increase the data size in order to benefit from the
>> multi-node capacity?
>> How can I make sure that my data is distributed across all the nodes?
>>
>> Thank you in advance for your assistance.
>>
>> Regards,
>> Mahsa
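
To make the split arithmetic in the reply above concrete, here is a rough sketch in Python. It is a simplified model, not Hadoop's actual split logic: it assumes each table is stored as a single file, that every file gets at least one split, and that splits are cut at exactly the configured size. The function names `estimate_splits` and `estimate_map_tasks` are made up for illustration.

```python
import math

MB = 1024 * 1024
DEFAULT_SPLIT_SIZE = 64 * MB  # default split size mentioned in the thread


def estimate_splits(file_size_bytes, split_size_bytes=DEFAULT_SPLIT_SIZE):
    """Estimate splits for one file: one per split-sized chunk, at least one.

    Simplified assumption: real Hadoop applies extra rules (block
    boundaries, min/max split settings) that are ignored here.
    """
    return max(1, math.ceil(file_size_bytes / split_size_bytes))


def estimate_map_tasks(file_sizes_bytes, split_size_bytes=DEFAULT_SPLIT_SIZE):
    """Estimate map tasks for a job as the sum of per-file splits."""
    return sum(estimate_splits(size, split_size_bytes) for size in file_sizes_bytes)


# The two tables from the question, assumed to be one file each.
tables = [int(1.7 * MB), int(4.2 * MB)]
print(estimate_map_tasks(tables))   # 2
print(estimate_splits(200 * MB))    # 4
```

Under these assumptions the 1.7 MB and 4.2 MB tables produce one split each, which is consistent with the two map tasks seen in the logs. Only inputs well above 64 MB would produce enough splits to spread work across several nodes.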
