[
https://issues.apache.org/jira/browse/SLIDER-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301200#comment-14301200
]
Steve Loughran commented on SLIDER-764:
---------------------------------------
I like the idea of a percentage; it moves away from "number needed to be able to
do work" to "number needed to sustain the workload expected".
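
A minimal sketch of how a percentage-based minimum might be evaluated, purely for illustration. The names `required_live` and `meets_minimum` and the integer-percent parameter are assumptions made here, not part of Slider's API or of any proposed patch.

```python
def required_live(desired: int, min_percent: int) -> int:
    """Smallest live-container count that satisfies min_percent of desired.

    Hypothetical helper: rounds up, so 70% of 3 containers means 3, not 2.
    """
    if desired < 0:
        raise ValueError("desired must be non-negative")
    if not 0 <= min_percent <= 100:
        raise ValueError("min_percent must be between 0 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (desired * min_percent + 99) // 100


def meets_minimum(desired: int, live: int, min_percent: int) -> bool:
    """True if the live count is enough to sustain the expected workload."""
    if live < 0:
        raise ValueError("live must be non-negative")
    return live >= required_live(desired, min_percent)
```

With 10 region servers requested and a threshold of 80, a check like this would report success only once at least 8 are live; rounding up is a design choice that errs on the side of not signalling success too early.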
> Allow specification of minimum number of live containers for each component
> type
> --------------------------------------------------------------------------------
>
> Key: SLIDER-764
> URL: https://issues.apache.org/jira/browse/SLIDER-764
> Project: Slider
> Issue Type: Task
> Reporter: Ted Yu
>
> While debugging a Slider-hbase deployment problem where client retrieved
> hbase-site.xml but verification of region server count failed, I found the
> following in SliderAppMaster log:
> {code}
> 2015-01-19 11:19:57,318 [AMRM Callback Handler Thread] INFO
> appmaster.SliderAppMaster (SliderAppMaster.java:onNodesUpdated(1603)) -
> Updated nodes [nodeId { host:
> "os-h2-2210-d6-sec-1421653828-hbase-slider3-1.hw.local" port: 45454 }
> httpAddress: "os-h2-2210-d6-sec-1421653828-hbase-slider3-1.hw.local:8044"
> rackName: "/default-rack" used { memory: 0 virtual_cores: 0 }
> capability { memory: 10240 virtual_cores: 8 } node_state: NS_UNHEALTHY
> health_report: "2/2 local-dirs are bad:
> /grid/0/yarn/local,/grid/1/yarn/local; 2/2 log-dirs are bad:
> /grid/0/yarn/log,/grid/1/yarn/log" last_health_report_time: 1421666370462]
> {code}
> If there aren't enough healthy nodes on which the requested number of
> components (such as region servers) can be deployed, Slider shouldn't signal
> deployment success.