In other words, I did not set the IP of the ES server in /etc/hosts on the YARN server, so the metric save requests could not reach the ES server and the metric data was lost.
On 01/17/2019 16:52, Lionel Liu <[email protected]> wrote:

I'm not sure about that; maybe I don't fully understand your situation. IMO, if it were a configuration issue, all the metrics should be lost, not just some of them.

Thanks,
Lionel

On Thu, Jan 17, 2019 at 11:00 AM 大鹏 <[email protected]> wrote:

I think I have found the reason. It should be that I did not configure the ES service IP on the YARN server, so there is no metric data in ES.

On 01/15/2019 09:34, Lionel Liu <[email protected]> wrote:

I think you need to check the logs of the Livy and Spark applications. In the Livy log, you can find how many jobs were submitted to the Spark cluster. In the Spark cluster UI, you can get the actual number of jobs and how many of them succeeded. Furthermore, if the job actually executed, you can look for the YARN log of the missing points to find the error message; then we can get more information about this case.

Thanks,
Lionel

On Mon, Jan 14, 2019 at 1:36 PM 大鹏 <[email protected]> wrote:

This is the attachment.

On 01/14/2019 13:22, Lionel Liu <[email protected]> wrote:

Hi DaPeng,

Griffin reads your data, executes the rule steps on it, then persists the metrics. If there is any exception, such as the data not being found or an execution error, a rule step might fail, and the following steps will not succeed either. The metrics are collected after the last step, so in such a situation there may be no metrics at all. All exceptions are written to the application log; you can find some information there.

I cannot see your attachment, so I'm not sure what kind of job you are running, what the data looks like, or whether it is accessible every minute.

Thanks,
Lionel

On Mon, Jan 14, 2019 at 11:06 AM 大鹏 <[email protected]> wrote:

What is the strategy ES uses to save metrics? My task was executed every five minutes, and part of the metrics generated during execution were lost, which does not match the number of task executions. For this problem, please see the red part in the attachment.
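A quick way to confirm the root cause described in this thread, before digging into Griffin or Livy logs, is to check from the YARN node whether the ES host can be resolved and reached at all. This is a minimal sketch, not Griffin code; the hostname `es-server` and port 9200 are placeholders, not values taken from the thread:

```python
import socket

def can_resolve(host):
    """Return True if the host name resolves at all
    (i.e. an /etc/hosts or DNS entry exists on this node)."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds
    within the timeout (i.e. the ES endpoint is reachable)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# If this is False on the YARN node, metric save requests can never
# reach Elasticsearch, matching the symptom discussed above.
reachable = can_resolve("es-server") and can_connect("es-server", 9200)
```

If resolution fails, the fix is the one described in the thread: add the ES server's address to /etc/hosts on the YARN node, e.g. a line like `192.168.1.100 es-server` (both values here are placeholders for your actual IP and hostname).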
