I have no answer to your questions, but I do have some questions of my own! Which tables are you talking about? Assuming you mean datasets/files when you say tables, why use Hadoop for such small tables?
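
To show why I ask about size, here is a rough back-of-the-envelope sketch, not a diagnosis. It assumes a 64 MB HDFS block size and roughly one map task per input split; neither is stated in this thread, and the function name is only illustrative. Under those assumptions each of the two files fits in a single block, so the job would get about two map tasks in total, however many nodes the cluster has.

```python
import math

# Assumption: 64 MB HDFS block size (a common default at the time);
# the real value for this cluster is not given in the thread.
BLOCK_SIZE_MB = 64


def estimated_splits(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Rough number of input splits (and so map tasks) for one file."""
    return max(1, math.ceil(file_size_mb / block_size_mb))


# Table sizes quoted in the original message.
table_sizes_mb = [1.7, 4.2]
splits = [estimated_splits(size) for size in table_sizes_mb]

print("splits per table:", splits)
print("total map tasks (estimate):", sum(splits))
```

With so few tasks, job startup and coordination overhead can easily outweigh any gain from extra nodes.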

On Mon, Aug 20, 2012 at 6:33 PM, Mahsa Mofidpoor <[email protected]> wrote:

> Hello,
>
> I run a simple join (select col_list from table1 join table2 on
> (join_condition)) on both single-node and multi-node setups. The table
> sizes are 1.7 MB and 4.2 MB respectively. It takes more time to execute
> the query on the cluster than to run it on a single-node Hadoop setup.
> I checked the map logs and saw that both mappings happen on the master
> node.
> Do I need to increase the data in order to benefit from the multi-node
> capacity?
> How can I make sure that my data is distributed on all the nodes?
>
> Thank you in advance for your assistance.
>
> Regards,
> Mahsa
