I’ve been setting up a Hadoop 2.9.1 cluster and have data replicating through
HDFS, but when I try to run a job via Hive (I see the Hive CLI is deprecated
in favor of Beeline, but it’s what I’m working with for now), the job never
gets out of the ACCEPTED state in the ResourceManager web UI.  From some
Googling, the general consensus is that this points to resource constraints,
so can someone tell me if I’ve got enough horsepower here?
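
In case it helps, this is how I’ve been watching the backlog from the
command line as well:

$ yarn application -list -appStates ACCEPTED   # everything stuck where my Hive job sits
$ yarn application -list -appStates RUNNING    # the lone running app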

I’ve got one small name server, three small data servers, and two larger data
servers.  I figured out that the small data servers were too small: even when
I tried to tweak the YARN parameters for RAM and CPU, their NodeManagers would
immediately shut down.  I added the two larger data servers, and now I see two
active nodes, but only one running container in total:

$ yarn node -list
19/07/09 23:54:11 INFO client.RMProxy: Connecting to ResourceManager at <resource_manager>:8032
Total Nodes:2
        Node-Id    Node-State    Node-Http-Address    Number-of-Running-Containers
    node1:40079       RUNNING           node1:8042                               1
    node2:36311       RUNNING           node2:8042                               0
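
To see what each node thinks it’s offering, I’ve also been checking the
per-node report (node IDs copied from the listing above), which includes
Memory-Capacity and CPU-Capacity lines:

$ yarn node -status node1:40079
$ yarn node -status node2:36311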

There’s a big backlog of automated jobs of some sort queued up (284 pending
applications, per the metrics below), and when I try to run anything through
Hive it just sits there and eventually times out; I do see it reach ACCEPTED.
My larger nodes have 4 GB RAM and 2 vcores each, and I set YARN to size
resources automatically via
yarn.nodemanager.resource.detect-hardware-capabilities (sketch of the config
below).  Is that enough to even get a proof-of-concept lab working?  I don’t
care about having the three smaller servers running as NodeManager nodes, but
I’d like a better understanding of what’s going on with the larger servers,
because they seem close to working.
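
For reference, the YARN resource settings in my yarn-site.xml on the larger
nodes amount to just this (everything else is left at the defaults; per
yarn-default.xml, -1 means "calculate from detected hardware" once detection
is enabled):

<property>
  <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
  <value>true</value>
</property>
<!-- these two are the shipped defaults, shown for completeness -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>-1</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>-1</value>
</property>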

Here’s the metrics data from the ResourceManager web UI; hopefully it helps
someone spot the problem.
Cluster Metrics
  Apps Submitted:       292
  Apps Pending:         284
  Apps Running:         1
  Apps Completed:       7
  Containers Running:   1
  Memory Used:          1 GB
  Memory Total:         3.38 GB
  Memory Reserved:      0 B
  VCores Used:          1
  VCores Total:         4
  VCores Reserved:      0

Cluster Nodes Metrics
  Active Nodes:          2
  Decommissioning Nodes: 0
  Decommissioned Nodes:  0
  Lost Nodes:            0
  Unhealthy Nodes:       0
  Rebooted Nodes:        0
  Shutdown Nodes:        4

Scheduler Metrics
  Scheduler Type:                        Capacity Scheduler
  Scheduling Resource Type:              [MEMORY]
  Minimum Allocation:                    <memory:1024, vCores:1>
  Maximum Allocation:                    <memory:1732, vCores:2>
  Maximum Cluster Application Priority:  0
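
If I’m reading those numbers right, each larger node is offering 1732 MB to
YARN (2 x 1732 MB = 3464 MB, which is the 3.38 GB total above), and the
per-container Maximum Allocation is capped at that same 1732 MB.  I pulled the
same figures from the ResourceManager REST API to double-check (same
placeholder hostname as above, assuming the default web port 8088):

$ curl http://<resource_manager>:8088/ws/v1/cluster/metrics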