Many thanks, Wen Lin, that was very helpful. It works now. :)

On Thu, Nov 26, 2015 at 5:13 PM, Wen Lin <[email protected]> wrote:

> Hi, Leon,
>
> First of all, the latest HAWQ uses "hawq_global_rm_type" to select "NONE"
> mode or "YARN" mode (but this is not the cause of the failure below).
>
> The log you attached shows that HAWQ is trying to run in YARN mode and
> attempts to register itself with the Hadoop YARN ResourceManager, but fails.
> (If registration succeeded, the Progress would be 50%, not 0%.)
>
> Please open your yarn-site.xml and check whether the property
> yarn.resourcemanager.system-metrics-publisher.enabled is true or false.
> If it is true, HAWQ will fail to register itself with Hadoop YARN, and the
> progress of HAWQ stays at 0% (expected 50%). In the Hadoop YARN log file, a
> null pointer exception occurs, just like your exception.
> This is similar to
> http://zh.hortonworks.com/community/forums/topic/error-in-handling-event-type-registered-for-applicationattempt/
>
> If yarn.resourcemanager.system-metrics-publisher.enabled is disabled,
> HAWQ can register itself with YARN successfully. I haven't investigated
> the root cause and don't know why the null pointer exception happens; I
> have only tracked it down to this setting.
> If yarn.resourcemanager.system-metrics-publisher.enabled is not the cause
> in your environment, something else may be causing the null pointer
> exception in YARN.
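>
> A minimal sketch of the yarn-site.xml setting discussed above, assuming the
> property is currently set to true. The property name is taken from this
> thread; having to restart the ResourceManager after editing the file is an
> assumption about a typical setup and has not been verified here.
>
> ```xml
> <!-- yarn-site.xml: workaround reported in this thread. With the system
>      metrics publisher disabled, HAWQ's registration with YARN succeeded. -->
> <property>
>     <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
>     <value>false</value>
> </property>
> ```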
>
> Thanks!
>
>
> On Thu, Nov 26, 2015 at 4:46 PM, Leon Zhang <[email protected]> wrote:
>
>> Thanks Daniel
>>
>>    After I switched "hawq_resourcemanager_server_type" to "yarn", I can see
>> the application now:
>>
>> $ yarn application -list
>>
>>
>> Application-Id                  Application-Name  Application-Type  User     Queue    State    Final-State  Progress  Tracking-URL
>> application_1447985660182_0558  hawq              YARN              xiaolin  default  RUNNING  UNDEFINED    0%        url
>>
>>    But my hawq application hangs in the RUNNING state, and the log shows:
>>
>>
>> 2015-11-26 16:40:16,186 INFO  security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:16,187 INFO  attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(762)) - appattempt_1447985660182_0620_000001 State change from LAUNCHED_UNMANAGED_SAVING to LAUNCHED
>> 2015-11-26 16:40:17,193 INFO  ipc.Server (Server.java:saslProcess(1306)) - Auth successful for appattempt_1447985660182_0620_000001 (auth:SIMPLE)
>> 2015-11-26 16:40:17,194 INFO  resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerApplicationMaster(274)) - AM registration appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:17,194 INFO  resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(127)) - USER=xiaolin IP=10.10.0.11 OPERATION=Register App Master   TARGET=ApplicationMasterService RESULT=SUCCESS  APPID=application_1447985660182_0620 APPATTEMPTID=appattempt_1447985660182_0620_000001
>> 2015-11-26 16:40:17,194 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(851)) - Error in handling event type REGISTERED for applicationAttempt application_1447985660182_0620
>> java.lang.NullPointerException
>>         at org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher.appAttemptRegistered(SystemMetricsPublisher.java:143)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1365)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1341)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>         at java.lang.Thread.run(Thread.java:745)
>> 2015-11-26 16:40:17,195 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(718)) - application_1447985660182_0620 State change from ACCEPTED to RUNNING
>> 2015-11-26 16:40:17,196 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
>> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>         at java.lang.Thread.run(Thread.java:745)
>> 2015-11-26 16:40:22,197 ERROR attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(757)) - Can't handle this event at current state
>> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:755)
>>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:849)
>>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:830)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:266)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>>
>> Any clue for this issue?
>>
>> Thanks in advance.
>>
>>
>> On Tue, Nov 24, 2015 at 12:48 AM, Daniel Lynch <[email protected]> wrote:
>>
>>> Here is a working config example from my lab, where HAWQ runs on
>>> YARN:
>>>
>>>
>>>
>>>
>>> $GPHOME/etc/hawq-site.xml
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <configuration>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_query_noresource_timeout</name>
>>>         <value>30</value>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_address_host</name>
>>>         <value>node2</value>
>>>         <description>The host name of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_address_port</name>
>>>         <value>2020</value>
>>>         <description>The port of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_address_port</name>
>>>         <value>40000</value>
>>>         <description>The port of hawq segment.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_dfs_url</name>
>>>         <value>node2:8020/hawq_default</value>
>>>         <description>URL for accessing HDFS.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_directory</name>
>>>         <value>/data/master</value>
>>>         <description>The directory of hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_directory</name>
>>>         <value>/data/primary</value>
>>>         <description>The directory of hawq segment.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_master_temp_directory</name>
>>>         <value>/tmp</value>
>>>         <description>The temporary directory reserved for hawq master.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_segment_temp_directory</name>
>>>         <value>/tmp</value>
>>>         <description>The temporary directory reserved for hawq segment.</description>
>>>     </property>
>>>
>>>     <!-- HAWQ resource manager parameters -->
>>>     <property>
>>>         <name>hawq_resourcemanager_server_type</name>
>>>         <value>yarn</value>
>>>         <description>The resource manager type to start for allocating resource.
>>>                      'none' means hawq resource manager exclusively uses whole
>>>                      cluster; 'yarn' means hawq resource manager contacts YARN
>>>                      resource manager to negotiate resource.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_segment_limit_memory_use</name>
>>>         <value>64GB</value>
>>>         <description>The limit of memory usage in a hawq segment when
>>>                      hawq_resourcemanager_server_type is set 'none'.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_segment_limit_core_use</name>
>>>         <value>16</value>
>>>         <description>The limit of virtual core usage in a hawq segment when
>>>                      hawq_resourcemanager_server_type is set 'none'.
>>>         </description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_resourcemanager_address</name>
>>>         <value>node3:8050</value>
>>>         <description>The address of YARN resource manager server.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_resourcemanager_scheduler_address</name>
>>>         <value>node3:8030</value>
>>>         <description>The address of YARN scheduler server.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_queue</name>
>>>         <value>default</value>
>>>         <description>The YARN queue name to register hawq resource manager.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>hawq_resourcemanager_yarn_application_name</name>
>>>         <value>hawq</value>
>>>         <description>The application name to register hawq resource manager in YARN.</description>
>>>     </property>
>>>
>>>     <property>
>>>         <name>default_segment_num</name>
>>>         <value>16</value>
>>>     </property>
>>>     <property>
>>>         <name>hawq_resourcemanager_query_vsegment_number_per_segment_limit</name>
>>>         <value>8</value>
>>>     </property>
>>> </configuration>
>>>
>>>
>>>
>>>
>>>
>>> Daniel Lynch
>>>
>>> On Sun, Nov 22, 2015 at 9:23 PM, Leon Zhang <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>>    Is there any tutorial on how to deploy the latest HAWQ 2.0-beta on a
>>>> YARN cluster?
>>>>    I just rebuilt the latest code from git, and after "hawq init
>>>> cluster", it seems the segments do not run in YARN containers. Any help
>>>> would be appreciated.
>>>>
>>>>
>>>> Thanks.
>>>>
>>>
>>>
>>
>
