Re: The ES metric data loss problem

Lionel Liu Mon, 21 Jan 2019 02:37:52 -0800

OK, I got it.
Because the Spark applications are submitted by Livy in cluster mode, if
the application driver is scheduled on a node that cannot resolve the ES
host, it cannot persist the metrics to ES. That would explain why only
some of the metrics were lost rather than all of them.
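
For anyone who wants to verify this on a particular YARN node, here is a
minimal sketch in Python. It is not part of Griffin; the function name
check_es_reachable is made up for illustration, and port 9200 is only
assumed to be the ES HTTP port in your setup. It checks whether the node
can resolve the ES hostname and open a TCP connection to it:

```python
import socket


def check_es_reachable(host, port=9200, timeout=3.0):
    """Return (resolved_ip_or_None, can_connect) for the given ES host.

    `host` and `port` are placeholders; use the values from your own
    Griffin/ES configuration.
    """
    try:
        # Fails if the name is neither in /etc/hosts nor resolvable via DNS.
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # Name resolved; now check that the port is actually reachable.
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False
```

Running check_es_reachable("<your-es-host>") on each node manager should
return (None, False) on the nodes that are missing the host entry, which
are the nodes where a driver would fail to persist metrics.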


Thanks,
Lionel

On Fri, Jan 18, 2019 at 10:09 AM 大鹏 <[email protected]> wrote:

>
> In other words, I did not add the ES server's IP to /etc/hosts on the
> YARN server, so the metric save requests could not reach the ES server and
> the metric data was lost.
>
> On 01/17/2019 16:52, Lionel Liu <[email protected]> wrote:
>
> I'm not sure about that; maybe I don't fully understand your situation.
> IMO, if it were a configuration issue, all the metrics would be lost, not
> just some of them.
>
> Thanks,
> Lionel
>
> On Thu, Jan 17, 2019 at 11:00 AM 大鹏 <[email protected]> wrote:
>
>
>
> I think I have found the reason. It seems I did not configure the ES
> service IP on the YARN server, so there is no metric data in ES.
>
> On 01/15/2019 09:34, Lionel Liu <[email protected]> wrote:
>
> I think you need to check the logs of Livy and the Spark applications.
> In the Livy log, you can find how many jobs were submitted to the Spark
> cluster.
> In the Spark cluster UI, you can get the actual number of jobs, and how many
> of them succeeded or failed.
> Furthermore, if the job was actually executed, you can try to find the YARN
> log for the missing points and look for the error message there; then we
> can get more information about this case.
>
> Thanks,
> Lionel
>
> On Mon, Jan 14, 2019 at 1:36 PM 大鹏 <[email protected]> wrote:
>
>
> Here is the attachment.
>
> On 01/14/2019 13:22, Lionel Liu <[email protected]> wrote:
>
> Hi DaPeng,
>
> Griffin reads your data, executes the rule steps on the data, then persists
> the metrics.
> If there's an exception, such as data that cannot be found or an execution
> error, the rule step might fail, and the following steps will not succeed
> either. The metrics are collected after the last step, so there might be no
> metrics in such a situation. All exceptions are logged in the application
> log, so you can find some information there.
> I cannot see your attachment. I'm not sure what kind of job you are
> running or what the data looks like. Is it accessible every minute?
>
> Thanks,
> Lionel
>
> On Mon, Jan 14, 2019 at 11:06 AM 大鹏 <[email protected]> wrote:
>
>
> What is the strategy for saving metrics to ES? My task is executed every
> five minutes, and some of the metrics generated during execution were lost,
> so the number of metrics does not match the number of task executions. For
> this problem, please see the red part in the attachment.
>
>
>
>
>
