Great, thanks, Wen Lin. That is very helpful. It works now. :)

On Thu, Nov 26, 2015 at 5:13 PM, Wen Lin <[email protected]> wrote:
> Hi Leon,
>
> First of all, the latest HAWQ uses "hawq_global_rm_type" to select "NONE"
> mode or "YARN" mode (but this is not the reason for the failure below).
>
> The log you attached shows that HAWQ is trying to run in YARN mode and
> attempts to register itself with the Hadoop YARN resource manager, but fails.
> (If it succeeded, the Progress would be 50%, not 0%.)
>
> Please open your yarn-site.xml and check whether the property
> yarn.resourcemanager.system-metrics-publisher.enabled is true or false.
> If yarn.resourcemanager.system-metrics-publisher.enabled is true,
> HAWQ will fail to register itself with Hadoop YARN, and the progress of HAWQ
> stays at 0% (expected 50%). A null pointer exception appears in the Hadoop
> YARN log file, just like your exception.
> This is similar to
> http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/
>
> If yarn.resourcemanager.system-metrics-publisher.enabled is disabled,
> HAWQ can register itself with YARN successfully. I haven't investigated
> the root cause and don't know why the null pointer exception happens; I have
> only traced it this far.
> If yarn.resourcemanager.system-metrics-publisher.enabled is not the cause
> in your environment, something else may be causing the null pointer
> exception in YARN.
>
> Thanks!
>
>
> On Thu, Nov 26, 2015 at 4:46 PM, Leon Zhang <[email protected]> wrote:
>
>> Thanks Daniel
>>
>> After I switched "hawq_resourcemanager_server_type" to "yarn", I can see
>> the application now:
>>
>> $ yarn application -list
>>
>> Application-Id                  Application-Name  Application-Type  User     Queue    State    Final-State  Progress  Tracking-URL
>> application_1447985660182_0558  hawq              YARN              xiaolin  default  RUNNING  UNDEFINED    0%        url
>>
>> But my hawq application hangs in the RUNNING state.
>> And the log shows:
>>
>> 2015-11-26 16:40:16,186 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:16,187 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(762)) - appattempt_1447985660182_0620_000001 State change from LAUNCHED_UNMANAGED_SAVING to LAUNCHED
>> 2015-11-26 16:40:17,193 INFO ipc.Server (Server.java:saslProcess(1306)) - Auth successful for appattempt_1447985660182_0620_000001 (auth:SIMPLE)
>> 2015-11-26 16:40:17,194 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerApplicationMaster(274)) - AM registration appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:17,194 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(127)) - USER=xiaolin IP=10.10.0.11 OPERATION=Register App Master TARGET=ApplicationMasterService RESULT=SUCCESS APPID=application_1447985660182_0620 APPATTEMPTID=appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:17,194 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(851)) - Error in handling event type REGISTERED for applicationAttempt application_1447985660182_0620
>> java.lang.NullPointerException
>>     at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>     at java.lang.Thread.run(Thread.java:745)
>> 2015-11-26 16:40:17,195 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1447985660182_0620 State change from ACCEPTED to RUNNING
>> 2015-11-26 16:40:17,196 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
>> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>     at java.lang.Thread.run(Thread.java:745)
>> 2015-11-26 16:40:22,197 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
>> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>     at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>>
>> Any clue for this issue?
>>
>> Thanks in advance.
>>
>>
>> On Tue, Nov 24, 2015 at 12:48 AM, Daniel Lynch <[email protected]> wrote:
>>
>>> here is a working config example from my lab where hawq will execute in
>>> yarn
>>>
>>> $GPHOME/etc/hawq-site.xml
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <configuration>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_query_noresource_timeout</name>
>>>         <value>30</value>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_address_host</name>
>>>         <value>node2</value>
>>>         <description>The host name of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_address_port</name>
>>>         <value>2020</value>
>>>         <description>The port of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_address_port</name>
>>>         <value>40000</value>
>>>         <description>The port of hawq segment.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_dfs_url</name>
>>>         <value>node2:8020/hawq_default</value>
>>>         <description>URL for accessing HDFS.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_directory</name>
>>>         <value>/data/master</value>
>>>         <description>The directory of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_directory</name>
>>>         <value>/data/primary</value>
>>>         <description>The directory of hawq segment.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_temp_directory</name>
>>>         <value>/tmp</value>
>>>         <description>The temporary directory reserved for hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_temp_directory</name>
>>>         <value>/tmp</value>
>>>         <description>The temporary directory reserved for hawq segment.</description>
>>>     </property>
>>>
>>>     <!-- HAWQ resource manager parameters -->
>>>     <property>
>>>         <name>hawq_resourcemanager_server_type</name>
>>>         <value>yarn</value>
>>>         <description>The resource manager type to start for allocating resource.
>>>                      'none' means hawq resource manager exclusively uses whole
>>>                      cluster; 'yarn' means hawq resource manager contacts YARN
>>>                      resource manager to negotiate resource.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_segment_limit_memory_use</name>
>>>         <value>64GB</value>
>>>         <description>The limit of memory usage in a hawq segment when
>>>                      hawq_resourcemanager_server_type is set 'none'.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_segment_limit_core_use</name>
>>>         <value>16</value>
>>>         <description>The limit of virtual core usage in a hawq segment when
>>>                      hawq_resourcemanager_server_type is set 'none'.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
>>>         <value>node3:8050</value>
>>>         <description>The address of YARN resource manager server.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
>>>         <value>node3:8030</value>
>>>         <description>The address of YARN scheduler server.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_queue</name>
>>>         <value>default</value>
>>>         <description>The YARN queue name to register hawq resource manager.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_application_name</name>
>>>         <value>hawq</value>
>>>         <description>The application name to register hawq resource manager in YARN.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>default_segment_num</name>
>>>         <value>16</value>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>
>>>         <value>8</value>
>>>     </property>
>>> </configuration>
>>>
>>>
>>> Daniel Lynch
>>> Mon-Fri 9-5 PST
>>> Office: 408 780 4498
>>>
>>> On Sun, Nov 22, 2015 at 9:23 PM, Leon Zhang <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there any tutorial on how to deploy the latest HAWQ 2.0-beta on a
>>>> YARN cluster?
>>>> I just rebuilt the latest code from git, and after "hawq init
>>>> cluster", it seems the segments do not work in YARN containers. Any help
>>>> will be appreciated.
>>>>
>>>>
>>>> Thanks.
>>>>
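
For anyone who lands on this thread with the same symptom (hawq stuck at RUNNING with 0% progress), the yarn-site.xml check that Wen Lin describes could be scripted roughly as below. This is only a sketch, not something taken from HAWQ or Hadoop: the helper names get_property and publisher_enabled are made up for illustration, the path to yarn-site.xml has to be passed in by you, and a missing property is treated as "not enabled", which assumes your cluster default for this setting is false.

```python
import sys
import xml.etree.ElementTree as ET

# Property named in Wen Lin's reply.
PROPERTY = "yarn.resourcemanager.system-metrics-publisher.enabled"


def get_property(root, name):
    """Return the <value> of the named <property> in a Hadoop-style
    <configuration> element, or None if the property is not present."""
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def publisher_enabled(root):
    """True only if the property is explicitly set to "true".
    Assumption: an absent property means the publisher is off."""
    value = get_property(root, PROPERTY)
    return value is not None and value.lower() == "true"


# Hypothetical usage: python check_yarn_site.py /path/to/yarn-site.xml
if __name__ == "__main__" and len(sys.argv) == 2:
    root = ET.parse(sys.argv[1]).getroot()
    if publisher_enabled(root):
        print(PROPERTY + " is true; per this thread, HAWQ may fail to register with YARN")
    else:
        print(PROPERTY + " is not enabled")
```

If the script reports the property as true, the thread suggests setting it to false in yarn-site.xml; a changed setting normally takes effect only after the YARN resource manager is restarted.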
