Jon,
Your recommendation worked. Thank you.
I'm currently running the following successfully:
* HDP 2.1.5 (Kerberized)
* Slider release-0.50.2-incubating-rc0
* Accumulo 1.6.0 (from app-packages)
* Storm 0.9.1 (from app-packages)
I haven't done any significant testing, but it seems to be working as
expected.
For the benefit of future readers, here are some notes on the hurdles I
encountered while debugging:
I created a separate principal to run Slider instead of using "yarn", as
most of the instructions specify, because I didn't want to modify
container-executor.cfg (which prevents certain users and UIDs from running
jobs).
I also rearranged the layout of Slider in HDFS, which required adjusting
application.def.
Inside appConfig.json, I changed site.global.app_user to my executing user
and set site.global.security_enabled to true.
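For anyone who wants to script that appConfig.json change, here is a minimal
Python sketch. It assumes the two site.global.* keys sit in the top-level
"global" section of appConfig.json and that the value is written as the
string "true"; the function name and the sample user "slideruser" are
hypothetical, so check them against your own app-package.

```python
import copy
import json


def apply_security_settings(app_config, user):
    """Return a copy of an appConfig dict with the user and security keys set."""
    updated = copy.deepcopy(app_config)
    # Assumption: the site.global.* keys live in the top-level "global" section.
    section = updated.setdefault("global", {})
    section["site.global.app_user"] = user
    # Assumption: the flag is stored as the string "true", not a JSON boolean.
    section["site.global.security_enabled"] = "true"
    return updated


if __name__ == "__main__":
    # Hypothetical fragment; a real appConfig.json carries more keys.
    sample = {"global": {"site.global.app_user": "yarn"}}
    print(json.dumps(apply_security_settings(sample, "slideruser"), indent=2))
```

The function works on a copy, so the original dict can be kept around for
comparison before the file is written back.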
-----Without the user and security settings, Storm produces logs like
this-----
14/09/09 21:54:31 INFO agent.AgentProviderService: Start of NIMBUS on
container_1409855300917_0287_01_000002 delayed as dependencies have not
started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
STORM_UI_SERVER START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of
STORM_UI_SERVER on container_1409855300917_0287_01_000004 delayed as
dependencies have not started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
DRPC_SERVER START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of DRPC_SERVER
on container_1409855300917_0287_01_000005 delayed as dependencies have not
started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
SUPERVISOR START as dependency NIMBUS is INSTALLED
14/09/09 21:54:41 INFO agent.AgentProviderService: Start of SUPERVISOR
on container_1409855300917_0287_01_000006 delayed as dependencies have not
started.
14/09/09 21:54:41 INFO agent.ComponentCommandOrder: Cannot schedule
NIMBUS START as dependency STORM_REST_API is INSTALL_FAILED
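The "Cannot schedule ... as dependency ..." messages above form a chain, and
the useful line is the one whose dependency is in a failed state. A rough
Python sketch for pulling that out of an application master log follows. The
regular expression is based only on the message format shown in this excerpt,
and the function names are my own, so treat it as a starting point rather
than a parser for every Slider log.

```python
import re

# Pattern taken from the ComponentCommandOrder messages in the excerpt above.
BLOCKED = re.compile(r"Cannot schedule (\w+) START as dependency (\w+) is (\w+)")


def blocked_components(log_text):
    """Map each blocked component to the (dependency, state) that blocks it."""
    # Collapse whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    return {m.group(1): (m.group(2), m.group(3)) for m in BLOCKED.finditer(flat)}


def root_causes(blocked):
    """Return the dependencies reported in a failed state, sorted."""
    return sorted({(dep, state) for dep, state in blocked.values()
                   if "FAILED" in state})
```

Run against the Storm excerpt, this points at STORM_REST_API being in
INSTALL_FAILED, which is the component to investigate first.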
-----Without the user and security settings, Accumulo produces logs like
this-----
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule
ACCUMULO_MONITOR START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of
ACCUMULO_MONITOR on container_1409855300917_0292_01_000004 delayed as
dependencies have not started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule
ACCUMULO_GC START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_GC on
container_1409855300917_0292_01_000005 delayed as dependencies have not
started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule
ACCUMULO_TRACER START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of ACCUMULO_TRACER
on container_1409855300917_0292_01_000006 delayed as dependencies have not
started.
14/09/09 22:49:18 INFO agent.ComponentCommandOrder: Cannot schedule
ACCUMULO_TSERVER START as dependency ACCUMULO_MASTER is INIT
14/09/09 22:49:18 INFO agent.AgentProviderService: Start of
ACCUMULO_TSERVER on container_1409855300917_0292_01_000003 delayed as
dependencies have not started.
14/09/09 22:49:18 INFO agent.AgentProviderService: Installing
ACCUMULO_MASTER on container_1409855300917_0292_01_000008.
14/09/09 22:49:18 INFO agent.AgentProviderService: Component operation.
Status: IN_PROGRESS
14/09/09 22:49:19 INFO agent.AgentProviderService: Component operation.
Status: COMPLETED
14/09/09 22:49:19 INFO agent.AgentProviderService: publishing
PublishedConfiguration{description='LogFolders' entries = 14}
14/09/09 22:49:19 INFO agent.AgentProviderService: Starting ACCUMULO_MASTER
on container_1409855300917_0292_01_000008.
14/09/09 22:49:21 INFO agent.AgentProviderService: Component operation.
Status: IN_PROGRESS
14/09/09 22:49:21 INFO agent.AgentProviderService: Component operation.
Status: FAILED
Thanks,
Tim
On Tue, Sep 9, 2014 at 11:54 AM, Tim Israel <[email protected]> wrote:
> I will give that a shot. Thanks Jon.
>
> Tim
>
> On Tue, Sep 9, 2014 at 11:38 AM, Jon Maron <[email protected]> wrote:
>
>> I would try to use a newer version of Slider. I believe the issue you’re
>> encountering is SLIDER-266.
>>
>> — Jon
>>
>> On Sep 9, 2014, at 11:17 AM, Tim Israel <[email protected]> wrote:
>>
>> > Hi everyone,
>> >
>> > I've been trying to deploy storm and accumulo on slider on a kerberized
>> > cluster for the past few days.
>> >
>> > My issue seems identical to an issue that was posted by Jon Maron in
>> July (
>> >
>> http://mail-archives.apache.org/mod_mbox/incubator-slider-dev/201407.mbox/%[email protected]%3E
>> > )
>> >
>> > Cluster : HDP 2.1.5 - Kerberized
>> > Slider Version : 0.30 (from HDP Slider Tech Preview) and 0.40 (Apache)
>> > app-packages: storm_v091 (tested in 0.30 and 0.40) and accumulo_v151
>> > (tested in 0.30 only)
>> >
>> >
>> > I get the same error for each app-package tested. I'm sending a partial
>> > stack trace below (I can send a more complete one if you're interested).
>> > It is identical to Jon Maron's.
>> >
>> > 14/09/09 13:50:53 ERROR appmaster.SliderAppMaster: Failed to start
>> > Container container_1409855300917_0090_01_000003
>> > org.apache.hadoop.yarn.exceptions.YarnException:
>> java.lang.NullPointerException
>> > at
>> org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:41)
>> > at
>> org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:224)
>> >
>> > ...
>> >
>> > Any help would be greatly appreciated.
>> >
>> > Thanks,
>> >
>> > Tim
>>
>>
>
>