Hi, Leon,

First of all, the latest HAWQ uses "hawq_global_rm_type" to select "NONE"
mode or "YARN" mode (but this is not the cause of the failure below).
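
For reference, a minimal sketch of how that setting might look in
hawq-site.xml on a recent build. The property name is the one mentioned
above; the lowercase value "yarn" is an assumption carried over from the
older hawq_resourcemanager_server_type setting in Daniel's example, so
please check it against the hawq-site.xml shipped with your build:

```xml
<!-- hawq-site.xml (sketch). Assumption: the newer property accepts the
     same values ('none' / 'yarn') as hawq_resourcemanager_server_type. -->
<property>
    <name>hawq_global_rm_type</name>
    <value>yarn</value>
</property>
```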

The log you attached shows that HAWQ is trying to run in YARN mode and
attempts to register itself with the Hadoop YARN ResourceManager, but fails.
(If registration had succeeded, the Progress would be 50%, not 0%.)

Please open your yarn-site.xml and check whether the property
yarn.resourcemanager.system-metrics-publisher.enabled is true or false.
If it is true, HAWQ will fail to register itself with Hadoop YARN, and the
progress of HAWQ stays at 0% (expected 50%). The Hadoop YARN log then shows
a null pointer exception, just like the one in your log.
This is similar to
http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/

If yarn.resourcemanager.system-metrics-publisher.enabled is disabled (false),
HAWQ can register itself with YARN successfully. I haven't investigated why
the null pointer exception happens; I have only traced it to this setting.
If yarn.resourcemanager.system-metrics-publisher.enabled is not the cause in
your environment, something else may be causing the null pointer exception
in YARN.
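
As a rough sketch of the change described above, the entry in
yarn-site.xml would look something like the fragment below. The property
name is the one discussed in this mail; the rest is the usual Hadoop
property layout. I am assuming the ResourceManager has to be restarted
before it picks up the new value.

```xml
<!-- yarn-site.xml (sketch): turn off the system metrics publisher so that
     the REGISTERED event no longer goes through SystemMetricsPublisher. -->
<property>
    <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
    <value>false</value>
</property>
```

After the restart, "yarn application -list" should show the hawq
application with Progress 50% instead of 0% if this was the cause.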

Thanks!


On Thu, Nov 26, 2015 at 4:46 PM, Leon Zhang <[email protected]> wrote:

> Thanks Daniel
>
>    After I switch "hawq_resourcemanager_server_type" to "yarn", I can see
> the application now:
>
> $ yarn application -list
>
>
>                 Application-Id      Application-Name
>  Application-Type          User           Queue                   State
>         Final-State             Progress
>         Tracking-URL
> application_1447985660182_0558                  hawq
>  YARN       xiaolin         default                 RUNNING
> UNDEFINED                   0%
>                  url
>
>    But, my hawq application hang at RUNNING state. And the log shows:
>
>
> 2015-11-26 16:40:16,186 INFO  security.AMRMTokenSecretManager
> (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for
> appattempt_1447985660182_0620_000001
> 2015-11-26 16:40:16,187 INFO  attempt.RMAppAttemptImpl
> (RMAppAttemptImpl.java:handle(762)) - appattempt_1447985660182_0620_000001
> State change from LAUNCHED_UNMANAGED_SAVING to LAUNCHED
> 2015-11-26 16:40:17,193 INFO  ipc.Server (Server.java:saslProcess(1306)) -
> Auth successful for appattempt_1447985660182_0620_000001 (auth:SIMPLE)
> 2015-11-26 16:40:17,194 INFO  resourcemanager.ApplicationMasterService
> (ApplicationMasterService.java:registerApplicationMaster(274)) - AM
> registration appattempt_1447985660182_0620_000001
> 2015-11-26 16:40:17,194 INFO  resourcemanager.RMAuditLogger
> (RMAuditLogger.java:logSuccess(127)) - USER=xiaolin IP=10.10.0.11
> OPERATION=Register App Master   TARGET=ApplicationMasterService
> RESULT=SUCCESS  APPID=application_1447985660182_0620
>  APPATTEMPTID=appattempt_1447985660182_0620_000001
> 2015-11-26 16:40:17,194 ERROR resourcemanager.ResourceManager
> (ResourceManager.java:handle(851)) - Error in handling event type
> REGISTERED for applicationAttempt application_1447985660182_0620
> java.lang.NullPointerException
>         at
> org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-11-26 16:40:17,195 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(718))
> - application_1447985660182_0620 State change from ACCEPTED to RUNNING
> 2015-11-26 16:40:17,196 ERROR attempt.RMAppAttemptImpl
> (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current
> state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid
> event: STATUS_UPDATE at LAUNCHED
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-11-26 16:40:22,197 ERROR attempt.RMAppAttemptImpl
> (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current
> state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid
> event: STATUS_UPDATE at LAUNCHED
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>         at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
>
>
> Any clue for this issue?
>
> Thanks in advance.
>
>
> On Tue, Nov 24, 2015 at 12:48 AM, Daniel Lynch <[email protected]> wrote:
>
>> here is a working config example from my lab where hawq will execute in
>> yarn
>>
>>
>>
>>
>> $GPHOME/etc/hawq-site.xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <configuration>
>>
>>     <property>
>>     <name>hawq_resourcemanager_query_noresource_timeout</name>
>> <value>30</value>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_address_host</name>
>>         <value>node2</value>
>>         <description>The host name of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_address_port</name>
>>         <value>2020</value>
>>         <description>The port of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_address_port</name>
>>         <value>40000</value>
>>         <description>The port of hawq segment.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_dfs_url</name>
>>         <value>node2:8020/hawq_default</value>
>>         <description>URL for accessing HDFS.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_directory</name>
>>         <value>/data/master</value>
>>         <description>The directory of hawq master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_directory</name>
>>         <value>/data/primary</value>
>>         <description>The directory of hawq segment.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_master_temp_directory</name>
>>         <value>/tmp</value>
>>         <description>The temporary directory reserved for hawq
>> master.</description>
>>     </property>
>>
>>     <property>
>>         <name>hawq_segment_temp_directory</name>
>>         <value>/tmp</value>
>>         <description>The temporary directory reserved for hawq
>> segment.</description>
>>     </property>
>>
>>     *<!-- HAWQ resource manager parameters -->*
>> *    <property>*
>> *        <name>hawq_resourcemanager_server_type</name>*
>> *        <value>yarn</value>*
>> *        <description>The resource manager type to start for allocating
>> resource.*
>> *                     'none' means hawq resource manager exclusively uses
>> whole*
>> *                     cluster; 'yarn' means hawq resource manager
>> contacts YARN*
>> *                     resource manager to negotiate resource.*
>> *        </description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>hawq_resourcemanager_segment_limit_memory_use</name>*
>> *        <value>64GB</value>*
>> *        <description>The limit of memory usage in a hawq segment when*
>> *                     hawq_resourcemanager_server_type is set 'none'.*
>> *        </description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>hawq_resourcemanager_segment_limit_core_use</name>*
>> *        <value>16</value>*
>> *        <description>The limit of virtual core usage in a hawq segment
>> when*
>> *                     hawq_resourcemanager_server_type is set 'none'.*
>> *        </description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>hawq_resourcemanager_yarn_resourcemanager_address</name>*
>> *        <value>node3:8050</value>*
>> *        <description>The address of YARN resource manager
>> server.</description>*
>> *    </property>*
>>
>> *    <property>*
>> *
>> <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>*
>> *        <value>node3:8030</value>*
>> *        <description>The address of YARN scheduler server.</description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>hawq_resourcemanager_yarn_queue</name>*
>> *        <value>default</value>*
>> *        <description>The YARN queue name to register hawq resource
>> manager.</description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>hawq_resourcemanager_yarn_application_name</name>*
>> *        <value>hawq</value>*
>> *        <description>The application name to register hawq resource
>> manager in YARN.</description>*
>> *    </property>*
>>
>> *    <property>*
>> *        <name>default_segment_num</name>*
>> *       <value>16</value>*
>> *    </property>*
>> *    <property>*
>> *
>> <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>*
>> *       <value>8</value>*
>> *    </property>*
>> *</configuration>*
>>
>>
>>
>>
>>
>> Daniel Lynch
>> Mon-Fri 9-5 PST
>> Office: 408 780 4498
>>
>> On Sun, Nov 22, 2015 at 9:23 PM, Leon Zhang <[email protected]> wrote:
>>
>>> Hi,
>>>
>>>    Is there any tutorial about how to deploy latest HAWQ 2.0-beta on
>>> YARN cluster?
>>>    I just rebuild the latest code from git, and after "hawq init
>>> cluster", it seems the segments does not work on YARN container. Any help
>>> will be appreciated.
>>>
>>>
>>> Thanks.
>>>
>>
>>
>
