Elen Chatikyan created YARN-11666:
-------------------------------------

             Summary: NullPointerException in TestSLSRunner.testSimulatorRunning
                 Key: YARN-11666
                 URL: https://issues.apache.org/jira/browse/YARN-11666
             Project: Hadoop YARN
          Issue Type: Bug
         Environment: {*}Operating System{*}: macOS (Sonoma 14.2.1 (23C71))
{*}Hardware{*}: MacBook Air 2023
{*}IDE{*}: IntelliJ IDEA (2023.3.2 (Ultimate Edition))
{*}Java Version{*}: OpenJDK version "1.8.0_292"
            Reporter: Elen Chatikyan


*What happened:*

In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Scheduler Load Simulator) framework, a *NullPointerException* is thrown during teardown of the parameterized tests. The exception is thrown when {{stop}} is called on the ResourceManager ({{rm}}) object in {_}RMRunner.java{_}. It occurs when the trace type (RUMEN, SLS, SYNTH) does not match the supplied trace file, so the {{rm}} object may not have been properly initialized by the time {{stop}} is invoked.

*Buggy code:*

The issue is located in the {{stop}} method of *{{RMRunner.java}}*:

{code:java}
public void stop() {
  rm.stop();
}
{code}

The root cause of the *{{NullPointerException}}* is the missing null check on the {{rm}} object before its {{stop}} method is called. Whenever the *{{ResourceManager}}* fails to initialize correctly, attempting to stop it dereferences a null reference.

Once {*}RMRunner.java{*} is fixed, TaskRunner needs the same fix, because its {{stop}} method calls methods on {{executor}} without a null check.

+TaskRunner.java+
{code:java}
public void stop() throws InterruptedException {
  executor.shutdownNow();
  executor.awaitTermination(20, TimeUnit.SECONDS);
}
{code}

*How to trigger this bug:*

* Change the data method of the parameterized unit test (TestSLSRunner.java) to include one or both of the following test cases:
** {capScheduler, "SYNTH", rumenTraceFile, nodeFile }
** {capScheduler, "SYNTH", slsTraceFile, nodeFile }
* Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method.
* Observe the resulting *NullPointerException* in the test output (triggered in RMRunner.java).
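
The failure mode described above can be illustrated in isolation. The following is only a minimal sketch, not the SLS code: {{FakeResourceManager}}, {{FakeRmRunner}} and the {{traceMatchesType}} flag are hypothetical stand-ins, and the sketch assumes that startup aborts before {{rm}} is assigned while teardown still calls {{stop}}.

```java
public class TeardownNpeSketch {

    /** Hypothetical stand-in for the real ResourceManager. */
    static class FakeResourceManager {
        void stop() {
            // nothing to release in this sketch
        }
    }

    /** Hypothetical stand-in for RMRunner: rm is assigned only if start succeeds. */
    static class FakeRmRunner {
        private FakeResourceManager rm;

        void start(boolean traceMatchesType) {
            if (!traceMatchesType) {
                // Assumption: a trace type / trace file mismatch aborts startup
                // before rm has been assigned.
                throw new IllegalStateException("trace file does not match trace type");
            }
            rm = new FakeResourceManager();
        }

        void stop() {
            rm.stop(); // same shape as the reported code: no null check
        }
    }

    /**
     * Runs start and then always runs stop, as a test tearDown would.
     * Returns the name of the exception raised by stop, or "none".
     */
    public static String runAndTearDown(boolean traceMatchesType) {
        FakeRmRunner runner = new FakeRmRunner();
        try {
            runner.start(traceMatchesType);
        } catch (IllegalStateException e) {
            // The test body failed; teardown still runs afterwards.
        }
        try {
            runner.stop();
            return "none";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndTearDown(true));  // prints "none"
        System.out.println(runAndTearDown(false)); // prints "NullPointerException"
    }
}
```

In this sketch the teardown exception hides the original startup failure, which matches the reported stack trace pointing at {{stop}} rather than at the mismatched trace.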
{panel:title=Example stack trace from the test output:}
[ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner)  Time elapsed: 3.027 s  <<< ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127)
    at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320)
    at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68)
    ...
{panel}
_______________________________________________________________________

_{color:#172b4d}The bug can be fixed by adding a null check for the {{rm}} object in the {{stop}} method of *{{RMRunner.java}}* before calling any methods on it (and likewise for the {{executor}} object in TaskRunner.java).{color}_
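
A null-guarded {{stop}} of the kind proposed in this report might look like the sketch below. It is not the actual patch: {{GuardedRmRunner}}, {{GuardedTaskRunner}} and {{FakeResourceManager}} are hypothetical stand-ins for the real classes, and the single-thread pool is an arbitrary choice for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class NullSafeStopSketch {

    /** Hypothetical stand-in for ResourceManager; records whether stop() ran. */
    static class FakeResourceManager {
        boolean stopped = false;

        void stop() {
            stopped = true;
        }
    }

    /** Hypothetical stand-in for RMRunner with the proposed null check. */
    static class GuardedRmRunner {
        FakeResourceManager rm; // stays null if initialization never completed

        void stop() {
            if (rm != null) {
                rm.stop();
            }
        }
    }

    /** Hypothetical stand-in for TaskRunner with the same guard on executor. */
    static class GuardedTaskRunner {
        ExecutorService executor; // stays null until start() is called

        void start() {
            executor = Executors.newFixedThreadPool(1);
        }

        void stop() throws InterruptedException {
            if (executor != null) {
                executor.shutdownNow();
                executor.awaitTermination(20, TimeUnit.SECONDS);
            }
        }
    }

    /** Returns true if stopping never-started runners completes without an exception. */
    public static boolean stopBeforeStartIsSafe() {
        try {
            new GuardedRmRunner().stop();
            new GuardedTaskRunner().stop();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns true if initialized runners are still stopped as before. */
    public static boolean stopAfterStartStillStops() {
        try {
            GuardedRmRunner rmRunner = new GuardedRmRunner();
            rmRunner.rm = new FakeResourceManager();
            rmRunner.stop();

            GuardedTaskRunner taskRunner = new GuardedTaskRunner();
            taskRunner.start();
            taskRunner.stop();
            return rmRunner.rm.stopped && taskRunner.executor.isShutdown();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(stopBeforeStartIsSafe());    // prints "true"
        System.out.println(stopAfterStartStillStops()); // prints "true"
    }
}
```

With the guard in place, teardown becomes a no-op for components that never started, so the original initialization failure is what surfaces in the test output.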