Do the logs on the three nodes contain anything interesting?

Chris

On Jan 9, 2014 3:47 AM, "Ashish Jain" <[email protected]> wrote:
> Here is the block info for the file I distributed. As can be seen, only
> 10.12.11.210 has all the data, and this is the node which is serving all
> the requests. Replicas are available on 209 as well as 211.
>
> 1073741857: 10.12.11.210:50010  10.12.11.209:50010
> 1073741858: 10.12.11.210:50010  10.12.11.211:50010
> 1073741859: 10.12.11.210:50010  10.12.11.209:50010
> 1073741860: 10.12.11.210:50010  10.12.11.211:50010
> 1073741861: 10.12.11.210:50010  10.12.11.209:50010
> 1073741862: 10.12.11.210:50010  10.12.11.209:50010
> 1073741863: 10.12.11.210:50010  10.12.11.209:50010
> 1073741864: 10.12.11.210:50010  10.12.11.209:50010
>
> --Ashish
>
> On Thu, Jan 9, 2014 at 2:11 PM, Ashish Jain <[email protected]> wrote:
>
>> Hello Chris,
>>
>> I now have a cluster with 3 nodes, with the replication factor still 2.
>> When I distribute a file I can see that replicas of the data are
>> available on the other nodes. However, when I run a map reduce job,
>> again only one node is serving all the requests :(. Can you or anyone
>> please provide some more input?
>>
>> Thanks
>> Ashish
>>
>> On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <[email protected]> wrote:
>>
>>> 2 nodes and a replication factor of 2 result in a replica of each
>>> block being present on each node. This allows the possibility that a
>>> single node does all the work while still being data-local. It will
>>> probably happen if that single node has the needed capacity. More
>>> nodes than the replication factor are needed to force distribution of
>>> the processing.
>>> Chris
>>> On Jan 8, 2014 7:35 AM, "Ashish Jain" <[email protected]> wrote:
>>>
>>>> Guys,
>>>>
>>>> I am sure that only one node is being used. I just ran the job again
>>>> and could see the CPU usage going high on only one server while the
>>>> other server's CPU usage remains constant, which means the other node
>>>> is not being used. Can someone help me debug this issue?
>>>>
>>>> ++Ashish
>>>>
>>>> On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <[email protected]> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> I have a 2 node hadoop cluster running with a replication factor of
>>>>> 2. I have a file of around 1 GB which, when copied to HDFS, is
>>>>> replicated to both nodes. Looking at the block info I can see that
>>>>> the file has been subdivided into 8 blocks, each of size 128 MB. I
>>>>> use this file as input to run the word count program. Somehow I feel
>>>>> only one node is doing all the work and the code is not distributed
>>>>> to the other node. How can I make sure the code is distributed to
>>>>> both nodes? Also, is there a log or GUI which can be used for this?
>>>>> Please note I am using the latest stable release, that is 2.2.0.
>>>>>
>>>>> ++Ashish
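Chris's point about replication factor versus node count can be sketched in a few lines. This is an illustration, not Hadoop code: the placements are taken from the block list earlier in the thread (node names shortened to their last octet), and a node that holds a replica of every block can run every map task "data-local" by itself.

```python
def nodes_holding_all_blocks(placements):
    """Given a list of per-block replica locations, return the nodes that
    hold a replica of every block (i.e. can be data-local for the whole job)."""
    blocks = [set(p) for p in placements]
    candidates = set.union(*blocks)
    return {n for n in candidates if all(n in b for b in blocks)}

# 2 nodes, replication factor 2: every block is on both nodes, so either
# node alone can serve the entire job while staying data-local.
two_node = [{"A", "B"}] * 8
print(nodes_holding_all_blocks(two_node))  # both nodes qualify

# The 3-node layout reported above: 10.12.11.210 happens to hold a replica
# of every one of the 8 blocks, so it alone can be data-local for all maps,
# which matches the observation that one node serves all the requests.
three_node = [
    {"210", "209"}, {"210", "211"}, {"210", "209"}, {"210", "211"},
    {"210", "209"}, {"210", "209"}, {"210", "209"}, {"210", "209"},
]
print(nodes_holding_all_blocks(three_node))  # -> {'210'}
```

With more nodes than the replication factor, a balanced placement leaves no single node holding every block, which is what forces the scheduler to spread the work.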
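The block count in the first message checks out. A quick sanity check, assuming the "around 1 GB" figure is exactly 1024 MB and the 128 MB block size reported (128 MB is also the 2.x default for dfs.blocksize):

```python
import math

file_size_mb = 1024   # ~1 GB input file, per the original post
block_size_mb = 128   # HDFS block size observed in the block info

# Number of HDFS blocks; with the default input format this is also the
# number of map tasks the word count job will get.
num_blocks = math.ceil(file_size_mb / block_size_mb)
print(num_blocks)  # -> 8, matching the 8 blocks listed above
```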
