Thanks Jonathan.

Regards
Vijay

> On 23 Oct 2015, at 16:10, Jonathan Hurley <[email protected]> wrote:
> 
> First you need to get the ID of the alert definition in your system:
> GET api/v1/clusters/<cluster>/alert_definitions?AlertDefinition/name=ambari_agent_disk_usage
> 
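> A minimal sketch of that lookup in Python (assumptions: the requests library 
> is available, the Ambari server is at http://ambari-host:8080, the credentials 
> are admin/admin and the cluster is named "c1"; substitute your own values):
> 
> import requests
> 
> AMBARI = "http://ambari-host:8080"   # assumed Ambari server URL
> AUTH = ("admin", "admin")            # assumed credentials
> CLUSTER = "c1"                       # assumed cluster name
> 
> # Query the definition by name and pull its numeric id out of the response
> resp = requests.get(
>     AMBARI + "/api/v1/clusters/" + CLUSTER
>     + "/alert_definitions?AlertDefinition/name=ambari_agent_disk_usage",
>     auth=AUTH)
> resp.raise_for_status()
> definition_id = resp.json()["items"][0]["AlertDefinition"]["id"]
> print(definition_id)
> 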
> Once you have the ID, you can do a PUT:
> PUT api/v1/clusters/<cluster>/alert_definitions/<id>
> 
> {
>   "AlertDefinition" : {
>     "source" : {
>       "parameters" : [
>         {
>           "name" : "minimum.free.space",
>           "display_name" : "Minimum Free Space",
>           "units" : "bytes",
>           "value" : 5.0E9,
>           "description" : "The overall amount of free disk space left before an alert is triggered.",
>           "type" : "NUMERIC"
>         },
>         {
>           "name" : "percent.used.space.warning.threshold",
>           "display_name" : "Warning",
>           "units" : "%",
>           "value" : 0.8,
>           "description" : "The percent of disk space consumed before a warning is triggered.",
>           "type" : "PERCENT"
>         },
>         {
>           "name" : "percent.free.space.critical.threshold",
>           "display_name" : "Critical",
>           "units" : "%",
>           "value" : 0.9,
>           "description" : "The percent of disk space consumed before a critical alert is triggered.",
>           "type" : "PERCENT"
>         }
>       ],
>       "path" : "alert_disk_space.py",
>       "type" : "SCRIPT"
>     }
>   }
> }
> 
> This changes the thresholds to 80% for warning and 90% for critical.
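> 
> A sketch of that update in Python, under the same assumptions as the lookup 
> above (requests library, admin/admin credentials, cluster "c1"); the file name 
> body.json is just an assumed placeholder for the AlertDefinition payload shown 
> above, and Ambari requires the X-Requested-By header on PUT requests:
> 
> import json
> import requests
> 
> AMBARI = "http://ambari-host:8080"   # assumed Ambari server URL
> AUTH = ("admin", "admin")            # assumed credentials
> CLUSTER = "c1"                       # assumed cluster name
> DEFINITION_ID = 42                   # id returned by the GET above
> 
> # Load the AlertDefinition payload shown above from a local file
> with open("body.json") as f:
>     payload = json.load(f)
> 
> resp = requests.put(
>     "{0}/api/v1/clusters/{1}/alert_definitions/{2}".format(
>         AMBARI, CLUSTER, DEFINITION_ID),
>     auth=AUTH,
>     headers={"X-Requested-By": "ambari", "Content-Type": "application/json"},
>     data=json.dumps(payload))
> resp.raise_for_status()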
> 
>> On Oct 23, 2015, at 10:45 AM, Vijaya Narayana Reddy Bhoomi Reddy 
>> <[email protected]> wrote:
>> 
>> Thanks Jonathan for your reply.
>> 
>> Can you please let me know the API call for modifying the threshold values?
>> 
>> Regards
>> Vijay
>> 
>> 
>>> On 23 Oct 2015, at 15:24, Jonathan Hurley <[email protected]> wrote:
>>> 
>>> The Ambari disk usage alerts are meant to check two things: that you have 
>>> enough total free space and enough percent free space in /usr/hdp for data 
>>> created by Hadoop and for installing versioned RPMs. Total free space alerts 
>>> are something that you’ll probably want to fix, since they mean you have 
>>> less than a certain amount of total free space left.
>>> 
>>> It seems like you’re talking about percent free space. Those can be changed 
>>> via the thresholds that the script uses. You can’t do this through the 
>>> Ambari Web Client. You have two options:
>>> 
>>> - Use the Ambari APIs to adjust the threshold values - this command is 
>>> rather long; let me know if you want to try this and I can paste the code 
>>> to do it.
>>> 
>>> - Edit the script directly and set the defaults to higher limits: 
>>> https://github.com/apache/ambari/blob/branch-2.1/ambari-server/src/main/resources/host_scripts/alert_disk_space.py#L36-L37
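>>> 
>>> For reference, the defaults at those linked lines are module-level constants 
>>> in alert_disk_space.py. From memory they look roughly like the snippet below; 
>>> treat the exact names and values as assumptions and verify them against the 
>>> linked source before editing:
>>> 
>>> # defaults used when no script parameters are passed (approximate)
>>> MIN_FREE_SPACE_DEFAULT = 5000000000L     # bytes, roughly 5 GB
>>> PERCENT_USED_WARNING_DEFAULT = 50        # warn at 50% used
>>> PERCENT_USED_CRITICAL_DEFAULT = 80       # critical at 80% used
>>> 
>>> Raising those values (or overriding them with script parameters) raises the 
>>> point at which the alert fires.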
>>> 
>>> 
>>>> On Oct 23, 2015, at 9:26 AM, Vijaya Narayana Reddy Bhoomi Reddy 
>>>> <[email protected]> wrote:
>>>> 
>>>> 
>>>> Siddharth,
>>>> 
>>>> Thanks for your response. As ours was a 4 node cluster, I changed it to 
>>>> Embedded mode from distributed mode and it is working fine. However, I am 
>>>> facing another issue with regard to Ambari agent disk usage alerts. 
>>>> Earlier, I had three alerts for three machines where /usr/hdp is utilised 
>>>> more than 50%.
>>>> 
>>>> Initially, when I set up the cluster, I had multiple mount points listed 
>>>> under yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs; /usr/hdp 
>>>> was one among them. Later, I changed these values so that only one value 
>>>> is present for each (/export/hadoop/yarn/local and 
>>>> /export/hadoop/yarn/log) and restarted the required components.
>>>> 
>>>> However, I am still seeing the Ambari disk usage alert for /usr/hdp. Can 
>>>> you please let me know how to get rid of these alerts?
>>>> 
>>>> Thanks 
>>>> Vijay
>>>> 
>>>> 
>>>>> On 22 Oct 2015, at 19:02, Siddharth Wagle <[email protected]> wrote:
>>>>> 
>>>>> Hi Vijaya,
>>>>> 
>>>>> Please make sure all of the configs are accurate 
>>>>> (https://cwiki.apache.org/confluence/display/AMBARI/AMS+-+distributed+mode).
>>>>> 
>>>>> Can you attach your ams-site.xml and /etc/ams-hbase/conf/hbase-site.xml?
>>>>> 
>>>>> - Sid
>>>>> 
>>>>> ________________________________________
>>>>> From: Vijaya Narayana Reddy Bhoomi Reddy 
>>>>> <[email protected]>
>>>>> Sent: Thursday, October 22, 2015 8:36 AM
>>>>> To: [email protected]
>>>>> Subject: Issue with Ambari Metrics Collector - Distributed mode
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am facing an issue while setting up Ambari Metrics in distributed mode. 
>>>>> I am setting up HDP 2.3.x using Ambari 2.1.x. Initially, when I was 
>>>>> setting up the cluster, I was shown a warning that the volume/directory 
>>>>> for the metrics service is the same as the one used by the datanode, and 
>>>>> hence I was recommended to change it. So I went ahead and pointed it to 
>>>>> HDFS, trying to set up the metrics service in distributed mode.
>>>>> 
>>>>> However, the Ambari Metrics service was not set up properly and it timed 
>>>>> out while setting up the cluster, showing a warning that the Ambari 
>>>>> Metrics service hadn’t started. I restarted the Metrics Collector service 
>>>>> multiple times, but it would stop again within a few seconds.
>>>>> 
>>>>> On further observation, I realised that in the ams-site.xml file, 
>>>>> timeline.metrics.service.operation.mode was still pointing to “embedded”, 
>>>>> whereas hbase-site.xml had all the required properties set correctly. So 
>>>>> I changed the timeline.metrics.service.operation.mode property to 
>>>>> “distributed” and restarted the required services as recommended by 
>>>>> Ambari. However, the restart process got stuck at 68% and eventually 
>>>>> timed out; it is not able to restart the Metrics Collector service. All 
>>>>> the Metrics Monitor services, however, restarted without any issues.
>>>>> 
>>>>> Can anyone please shed light on why this is happening and how to fix it?
>>>>> 
>>>>> Thanks
>>>>> Vijay
>>>> 
>>>> 
>>> 
>> 
>> 
> 


