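OK, I got it. Because the Spark applications are submitted by Livy in cluster mode, the application driver can be scheduled on any node of the cluster. If it lands on a node that cannot resolve the ES host, it cannot persist the metrics to ES, which would explain why only some of the metrics were lost.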
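To confirm which nodes are affected, you could run a small check like the one below on each YARN node that may host the driver. This is only a minimal sketch: the function name `resolve_host` is made up for illustration and is not part of Griffin or Livy, and 9200 is assumed as the port because it is the Elasticsearch HTTP default. Adjust it if your cluster uses a different one.

```python
import socket


def resolve_host(host, port=9200):
    """Return the sorted IP addresses that `host` resolves to on this node.

    An empty list means this node cannot resolve the name (for example,
    no entry in /etc/hosts and no DNS record), so a driver running here
    would fail to reach the ES server by that name.
    `port` is only passed through to the resolver; 9200 is an assumed default.
    """
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution failed on this node.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host` with the ES host name from your Griffin sink configuration should return a non-empty list on every node. A node that returns an empty list is one where the driver could not persist metrics. Note that this only checks name resolution, not whether the ES port is actually reachable.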
Thanks,
Lionel

On Fri, Jan 18, 2019 at 10:09 AM 大鹏 <[email protected]> wrote:

> In other words, I did not set the IP of the ES server in /etc/hosts on the
> YARN server, so the metric save request could not reach the ES server and
> the metric data was lost.
>
> On 01/17/2019 16:52, Lionel Liu <[email protected]> wrote:
>
> I'm not sure about that; maybe I don't fully understand your situation.
> IMO, if it were a configuration issue, all the metrics should be lost, not
> just some of them.
>
> Thanks,
> Lionel
>
> On Thu, Jan 17, 2019 at 11:00 AM 大鹏 <[email protected]> wrote:
>
> I think I have found the reason. It seems that I did not configure the ES
> service IP on the YARN server, so there is no metric data in ES.
>
> On 01/15/2019 09:34, Lionel Liu <[email protected]> wrote:
>
> I think you need to check the logs of Livy and the Spark applications.
> In the Livy log, you can find how many jobs were submitted to the Spark
> cluster. In the Spark cluster UI, you can get the actual number of jobs and
> how many of them succeeded or failed.
> Furthermore, if the job actually executed, you can try to find the YARN log
> for the missing points and look for the error message there. Then we can
> get more information about this case.
>
> Thanks,
> Lionel
>
> On Mon, Jan 14, 2019 at 1:36 PM 大鹏 <[email protected]> wrote:
>
> This is the attachment.
>
> On 01/14/2019 13:22, Lionel Liu <[email protected]> wrote:
>
> Hi DaPeng,
>
> Griffin reads your data, executes the rule steps on the data, then persists
> the metrics.
> If there is any exception, such as data that cannot be found or an
> execution error, the rule step might fail, and the following steps will not
> succeed either. The metrics are collected after the last step, so there
> might be no metrics in such a situation. All the exceptions are logged in
> the application log; you can find some information there.
> I cannot see your attachment. I'm not sure what kind of job you are
> running or what the data is like. Is it accessible every minute?
>
> Thanks,
> Lionel
>
> On Mon, Jan 14, 2019 at 11:06 AM 大鹏 <[email protected]> wrote:
>
> What is ES's strategy for saving metrics? My task is executed every five
> minutes, and some of the metrics generated during execution were lost, so
> the number of metrics does not match the number of task executions. For
> this problem, please see the red part in the attachment.
