1 - I installed Hadoop MRv2 in virtual machines. While jobs are running, I try to list them with "hadoop jobs -list", but the command takes a long time to execute. This happens because of the performance of the VM. I just wonder how it works on big machines. Does anyone have an idea whether it takes long to launch Hadoop commands while jobs are executing?
*>> Getting job information involves communication with the ResourceManager/ApplicationMaster. Because the resources (CPU, memory) available in your VM are very limited, the hadoop command may take a long time to get the job information.*

2 - I want to run several jobs at the same time. How can I configure the maximum number of jobs that I can run at the same time?

*>> Once you submit your job to the RM, the scheduler decides how to run it, based on the scheduler you use and the resources available in your cluster. You have to write or customize a scheduler to control the submission order or the number of jobs running at any instant.*

3 - Is there a calculation of how many jobs I can run at the same time for a specific environment, similar to how many reduces we should set in our jobs?

*>> If you have a clear idea of how much data your jobs are going to process, how many resources they are going to use, and how many total resources are available in the cluster, then you can work out how many jobs can run at any instant. That is possible when you handle only a fixed data set in every cycle; in a real environment it is not possible to calculate these things for each job in each run.*

*In Hadoop 2 the RM takes care of all resource management, so you need not take special care of these things. If you need ordered processing of jobs, look into an Oozie-like tool to control the order of MR jobs.*

On Mon, Jan 27, 2014 at 11:09 AM, xeon <[email protected]> wrote:

> Hi,
>
> 1 - I installed Hadoop MRv2 in VirtualMachines. When the jobs are
> running, I try to list them with "hadoop jobs -list", but it takes lots
> of time for the command being executed. This happens because of the
> performance of the VM. I just wonder how it works with big machines.
> Does anyone have an idea if it takes long to launch Hadoop commands
> while executing jobs?
>
>
> 2 - I want to run several jobs at the same time. How can I configure
> the maximum number of jobs that I can run at the same time?
>
>
> 3 - Is there a calculation of how many jobs I can run at the same time
> for specific environment similar to how many reduces should we set in
> our jobs?
>
> Thanks,
>
> --
> Best regards,
>

--
Regards,
...Sudhakara.st
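
For point 3, the kind of estimate described above can be sketched as a back-of-the-envelope calculation. This is only an illustration under assumptions that are not from the thread: the function name, the node count, and the memory figures are hypothetical, all task containers are assumed to be the same size, and memory is treated as the only constraint. Each MRv2 job also needs one container for its ApplicationMaster, which the sketch counts separately.

```python
def max_concurrent_jobs(cluster_memory_mb, am_memory_mb, task_memory_mb, tasks_per_job):
    """Upper bound on jobs that fit in memory at once (hypothetical helper).

    Assumes each job holds one ApplicationMaster container plus
    `tasks_per_job` equally sized task containers at the same time.
    """
    if min(cluster_memory_mb, am_memory_mb, task_memory_mb, tasks_per_job) <= 0:
        raise ValueError("all arguments must be positive")
    # Memory one job occupies while all of its containers are running.
    memory_per_job_mb = am_memory_mb + tasks_per_job * task_memory_mb
    # Whole jobs only: a job that does not fully fit is not counted.
    return cluster_memory_mb // memory_per_job_mb


# Hypothetical cluster: 4 nodes, each offering 8192 MB to containers.
cluster_mb = 4 * 8192
jobs = max_concurrent_jobs(cluster_mb, am_memory_mb=1536,
                           task_memory_mb=1024, tasks_per_job=6)
print(jobs)
```

The scheduler still makes the real decision at run time, so the result is better read as a ceiling for a fixed workload than as a value to configure.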
