Hi Greg,
Maintenance mode (MM) was implemented to exclude specific resources that
are in MM from cluster-wide, host-wide, and service-wide operations. That
means, for example, that starting an entire service should not affect the
few host components that are in MM. The operation level tells Ambari how
high-level an operation is. Valid values are
CLUSTER/SERVICE/HOST/HOST_COMPONENT. In some cases the operation level can
be guessed automatically from the resource filter, but it's better to
provide it explicitly.
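For reference, here is a minimal Python sketch of a request that sets the
level explicitly. It is modeled on the body you posted (quoted below); the
helper name build_stop_request is made up for illustration, and the keys are
only the ones from your request, not an exhaustive list. It just prints the
request instead of sending it.

```python
import json


def build_stop_request(cluster_name, host_name):
    """Build a PUT body that stops all components on one host.

    Hypothetical helper: the keys mirror the request body quoted in this
    thread, with the operation level stated explicitly as HOST.
    """
    return {
        "RequestInfo": {
            "context": "Stop All Components",
            "operation_level": {
                "level": "HOST",
                "cluster_name": cluster_name,
                "host_name": host_name,
            },
        },
        # INSTALLED is the desired state used in this thread to stop components.
        "Body": {"HostRoles": {"state": "INSTALLED"}},
    }


cluster = "testcluster"
host = "c6404.ambari.apache.org"
path = "/api/v1/clusters/%s/hosts/%s/host_components" % (cluster, host)
body = build_stop_request(cluster, host)

print("PUT", path)
print(json.dumps(body, indent=2))
```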
If the operation level of an operation is higher than the level of some MM
resource, that resource is excluded from the operation.
For example, let the HDFS service be in MM. If we perform some operation
on NN/DN components and pass operation_level=CLUSTER, the components will
not be affected (they inherit the MM state from the service). If we perform
some operation on NN/DN components and pass operation_level=SERVICE (or
HOST_COMPONENT), the components will be affected.
The hierarchy is:
          SERVICE
CLUSTER <         > HOST_COMPONENT
          HOST
The picture above means that SERVICE and HOST are independent operation
levels, but both are inherited by host components.
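To make the rule concrete, here is a rough Python sketch of the check as
described in this mail. It is not Ambari's actual code: the names CHILDREN,
is_higher and is_excluded are made up for illustration, and treating two
independent levels (SERVICE and HOST) as "not higher" than each other is an
assumption of the sketch.

```python
# Levels directly below each operation level, following the picture above.
# Illustrative model only; this is not taken from the Ambari source.
CHILDREN = {
    "CLUSTER": {"SERVICE", "HOST"},
    "SERVICE": {"HOST_COMPONENT"},
    "HOST": {"HOST_COMPONENT"},
    "HOST_COMPONENT": set(),
}


def is_higher(a, b):
    """Return True if level a is strictly above level b in the hierarchy."""
    below = CHILDREN[a]
    return b in below or any(is_higher(child, b) for child in below)


def is_excluded(operation_level, mm_level):
    """Return True if a resource is skipped by the operation.

    mm_level is the level at which MM was turned on (for example SERVICE
    when NN/DN inherit MM from the HDFS service). The resource is skipped
    only when the operation level is strictly higher than that level.
    """
    return is_higher(operation_level, mm_level)


# HDFS service in MM, operation on NN/DN components:
for level in ("CLUSTER", "SERVICE", "HOST_COMPONENT"):
    outcome = "excluded" if is_excluded(level, "SERVICE") else "affected"
    print(level, "->", outcome)
```

With the HDFS example this prints "excluded" for CLUSTER and "affected" for
SERVICE and HOST_COMPONENT. Under the same model, a host in MM combined with
operation_level=HOST is not excluded, which matches what you saw.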
Thanks,
Dmitry
On Fri, Nov 7, 2014 at 5:55 PM, Greg Hill <[email protected]> wrote:
> The host is in maintenance mode, but the components are not stopped. I'm
> setting maintenance mode prior to stopping services because otherwise you
> get nagios notifications when the components are stopped (or this used to
> be the case anyway).
>
> Adding the operation_level made everything work correctly. It returned a
> request, and the components were stopped after the request finished, at
> which point I was then able to remove the components from the host (this
> is where it was failing previously because the components were not
> stopped).
>
> This is my new request body:
>
> "RequestInfo": {
>     "context": "Stop All Components",
>     "operation_level": {
>         "level": "HOST",
>         "cluster_name": self.cluster_name,
>         "host_name": self.host_name,
>     },
> },
> "Body": {
>     "HostRoles": {"state": "INSTALLED"},
> }
>
> Thanks for the help, although the behavior still confuses me a little.
> Why would it be prevented in maintenance mode when that's presumably the
> reason maintenance mode exists (to be able to muck about with things
> without getting false alarms)? Maybe I misunderstand what maintenance
> mode is for?
>
> Greg
>
>
> On 11/7/14 9:33 AM, "Yusaku Sako" <[email protected]> wrote:
>
> >Hi Greg,
> >
> >The API call you mentioned to stop all components on a host still
> >works in 1.7.0 (I just verified on my recent 1.7.0 cluster).
> >Operation_level is not mandatory and the WARN can be ignored.
> >Operation_level drives the behavior of operations when
> >services/hosts/host_components are in maintenance mode.
> >Unfortunately I don't see any documentation on this.
> >I presume you are getting 200 because all components on the specified
> >host are already stopped.
> >
> >Yusaku
> >
> >On Fri, Nov 7, 2014 at 5:55 AM, Greg Hill <[email protected]> wrote:
> >> This used to work in earlier 1.7.0 builds, but doesn't seem to any
> >> longer:
> >>
> >> PUT /api/v1/clusters/testcluster/hosts/c6404.ambari.apache.org/host_components
> >> {"RequestInfo": {"context": "Stop All Components"},
> >>  "Body": {"HostRoles": {"state": "INSTALLED"}}}
> >>
> >> Seeing this in the server logs:
> >> 13:05:42,082 WARN [qtp1842914725-24] AmbariManagementControllerImpl:2149 -
> >>     Can not determine request operation level. Operation level property
> >>     should be specified for this request.
> >> 13:05:42,082 INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 -
> >>     Received a updateHostComponent request, clusterName=testcluster,
> >>     serviceName=HDFS, componentName=DATANODE,
> >>     hostname=c6404.ambari.apache.org, request={ clusterName=testcluster,
> >>     serviceName=HDFS, componentName=DATANODE,
> >>     hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
> >>     desiredStackId=null, staleConfig=null, adminState=null}
> >> 13:05:42,083 INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 -
> >>     Received a updateHostComponent request, clusterName=testcluster,
> >>     serviceName=GANGLIA, componentName=GANGLIA_MONITOR,
> >>     hostname=c6404.ambari.apache.org, request={ clusterName=testcluster,
> >>     serviceName=GANGLIA, componentName=GANGLIA_MONITOR,
> >>     hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
> >>     desiredStackId=null, staleConfig=null, adminState=null}
> >> 13:05:42,083 INFO [qtp1842914725-24] AmbariManagementControllerImpl:2162 -
> >>     Received a updateHostComponent request, clusterName=testcluster,
> >>     serviceName=YARN, componentName=NODEMANAGER,
> >>     hostname=c6404.ambari.apache.org, request={ clusterName=testcluster,
> >>     serviceName=YARN, componentName=NODEMANAGER,
> >>     hostname=c6404.ambari.apache.org, desiredState=INSTALLED,
> >>     desiredStackId=null, staleConfig=null, adminState=null}
> >>
> >> But I get an empty response with status 200 and no request was created.
> >> Shouldn't that be an error if it can't act on my request?
> >>
> >> Are there some docs about how to formulate the 'operation level' part
> >> of the request?
> >>
> >> Greg
> >>
> >
> >--
> >CONFIDENTIALITY NOTICE
> >NOTICE: This message is intended for the use of the individual or entity
> >to
> >which it is addressed and may contain information that is confidential,
> >privileged and exempt from disclosure under applicable law. If the reader
> >of this message is not the intended recipient, you are hereby notified
> >that
> >any printing, copying, dissemination, distribution, disclosure or
> >forwarding of this communication is strictly prohibited. If you have
> >received this communication in error, please contact the sender
> >immediately
> >and delete it from your system. Thank You.
>
>