Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-14 Thread bat man
Hello Arvid,

Thanks, I’ll check my config, switch to the correct reporter, and test it out.

Thanks,
Hemant

On Fri, 14 Aug 2020 at 6:57 PM, Arvid Heise  wrote:

> Hi Hemant,
>
> According to the InfluxDB section of the 1.9 metrics documentation [1], you
> should use the reporter without a factory. The factory was added later.
>
> metrics.reporter.influxdb.class: org.apache.flink.metrics.influxdb.InfluxdbReporter
> metrics.reporter.influxdb.host: localhost
> metrics.reporter.influxdb.port: 8086
> metrics.reporter.influxdb.db: flink
> metrics.reporter.influxdb.username: flink-metrics
> metrics.reporter.influxdb.password: qwerty
> metrics.reporter.influxdb.retentionPolicy: one_hour
>
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/metrics.html#influxdb-orgapacheflinkmetricsinfluxdbinfluxdbreporter

Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-14 Thread Arvid Heise
Hi Hemant,

According to the InfluxDB section of the 1.9 metrics documentation [1], you
should use the reporter without a factory. The factory was added later.

metrics.reporter.influxdb.class: org.apache.flink.metrics.influxdb.InfluxdbReporter
metrics.reporter.influxdb.host: localhost
metrics.reporter.influxdb.port: 8086
metrics.reporter.influxdb.db: flink
metrics.reporter.influxdb.username: flink-metrics
metrics.reporter.influxdb.password: qwerty
metrics.reporter.influxdb.retentionPolicy: one_hour


[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/metrics.html#influxdb-orgapacheflinkmetricsinfluxdbinfluxdbreporter
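
To spot this kind of mismatch before submitting a job, a small script can scan the configuration for reporters that are set up through a factory. This is only a rough sketch: the helper names (parse_flat_conf, reporters_using_factory) are made up for illustration, and the parser assumes the flat "key: value" layout shown above rather than full YAML.

```python
import re

# Matches keys such as "metrics.reporter.influxdb.factory.class".
FACTORY_KEY = re.compile(r"^metrics\.reporter\.([^.]+)\.factory\.class$")


def parse_flat_conf(text):
    """Parse flat 'key: value' lines, skipping blanks and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def reporters_using_factory(conf):
    """Return the names of reporters configured through factory.class."""
    names = []
    for key in conf:
        match = FACTORY_KEY.match(key)
        if match:
            names.append(match.group(1))
    return sorted(names)


# Hypothetical sample input mirroring the factory-based setup from the logs.
sample = """
metrics.reporter.influxdb.factory.class: org.apache.flink.metrics.influxdb.InfluxdbReporterFactory
metrics.reporter.influxdb.host: localhost
metrics.reporter.influxdb.port: 8086
"""
conf = parse_flat_conf(sample)
print(reporters_using_factory(conf))
```

Any reporter name this prints would, following the advice above, need to be switched to the plain metrics.reporter.<name>.class key on 1.9.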


Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-13 Thread bat man
Has anyone integrated Flink metrics with an external system while running
on AWS EMR? If so, can you tell me whether this is a configuration issue or
an EMR-specific issue?

Thanks,
Hemant


Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-12 Thread bat man
An update: in the YARN logs I could see the following.

Classpath:
*lib/flink-metrics-influxdb-1.9.0.jar:lib/flink-shaded-hadoop-2-uber-2.8.5-amzn-5-7.0.jar:lib/flink-table-blink_2.11-1.9.0.jar:lib/flink-table_2.11-1.9.0.jar:lib/log4j-1.2.17.jar:lib/slf4j-log4j12-1.7.15.jar:log4j.properties:plugins/influxdb/flink-metrics-influxdb-1.9.0.jar*
*..*
*..*

This means the jar is getting loaded. In the logs I could also see:
2020-08-12 15:28:51,505 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner - Registered UNIX signal handlers for [TERM, HUP, INT]
2020-08-12 15:28:51,508 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner - Current working Directory: /mnt/yarn/usercache/hadoop/appcache/application_1595767096609_0013/container_1595767096609_0013_01_04

*2020-08-12 15:28:51,512 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.interval, 60 SECONDS*

*2020-08-12 15:28:51,512 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.yarn.conf.dir, /etc/hadoop/conf*
2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.host, xx.xxx.xxx.xx
2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.cluster-id, application_1595767096609_0013
2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, ip-xx-x-xx-xxx.ec2.internal
2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.password, **

*2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: FLINK_PLUGINS_DIR, /usr/lib/flink/plugins*
2020-08-12 15:28:51,513 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.db, xx
2020-08-12 15:28:51,520 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.connectTimeout, 6
2020-08-12 15:28:51,520 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: env.hadoop.conf.dir, /etc/hadoop/conf
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: web.port, 0
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.username,
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.size, 264241152b
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: web.tmpdir, /tmp/flink-web-5562f065-6020-4c38-8260-3aea434bf285
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 32777
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.port, 8086
2020-08-12 15:28:51,521 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.retentionPolicy, one_hour
2020-08-12 15:28:51,522 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: internal.cluster.execution-mode, NORMAL
2020-08-12 15:28:51,522 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.writeTimeout, 6
2020-08-12 15:28:51,522 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.consistency, ONE
2020-08-12 15:28:51,522 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.address, ip-xx-x-xx-xxx.ec2.internal
*2020-08-12 15:28:51,522 INFO  org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.influxdb.factory.class, org.apache.flink.metrics.influxdb.InfluxdbReporterFactory*
...
But then, further down, I could see:

*2020-08-12 15:28:51,523 WARN  org.apache.flink.core.plugin.PluginConfig - Environment variable [FLINK_PLUGINS_DIR] is set to [/usr/lib/flink/plugins] but the directory doesn't exist*
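
The two observations above (the same jar listed under both lib/ and plugins/, and a plugins directory that is reported missing) can be checked mechanically. The following is a minimal sketch with made-up helper names (duplicate_jars, plugins_dir_exists). Whether either finding is actually related to the missing metrics is not established in this thread.

```python
import os
from collections import defaultdict


def duplicate_jars(classpath):
    """Map each jar file name to its classpath entries if it occurs more than once."""
    seen = defaultdict(list)
    for entry in classpath.split(":"):
        name = entry.rsplit("/", 1)[-1]
        if name.endswith(".jar"):
            seen[name].append(entry)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


def plugins_dir_exists(path):
    """Return True if the given plugins directory exists on this node."""
    return os.path.isdir(path)


# Shortened version of the classpath from the YARN log above.
classpath = (
    "lib/flink-metrics-influxdb-1.9.0.jar:lib/log4j-1.2.17.jar:"
    "plugins/influxdb/flink-metrics-influxdb-1.9.0.jar"
)
print(duplicate_jars(classpath))
print(plugins_dir_exists("/usr/lib/flink/plugins"))
```

The directory check is only meaningful when run on the node whose log shows the warning, since each container resolves the path locally.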

Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-12 Thread bat man
Hello Experts,

I am running Flink 1.9.0 on AWS EMR (emr-5.28.1) and want to push metrics
to InfluxDB. I followed the documentation [1]: I added the configuration to
/usr/lib/flink/conf/flink-conf.yaml and copied the jar to the
/usr/lib/flink/lib folder on the master node. However, when I run the job
after only these steps, I don't see any measurement (table) created in my
InfluxDB database, so I understand the cluster might need a restart. I am
not able to find any documentation on how to restart the cluster on EMR.
If anyone has configured pushing metrics to InfluxDB from AWS EMR, could
you please share the steps?

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/metrics.html#influxdb-orgapacheflinkmetricsinfluxdbinfluxdbreporter

Thanks,
Hemant
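
One way to confirm whether any measurement has been created is to ask InfluxDB directly. The sketch below assumes an InfluxDB 1.x server exposing the HTTP /query endpoint without authentication; the function names are made up for illustration, and the host, port, and database values are placeholders taken from the example configuration, not from the actual cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def show_measurements_url(host, port, db):
    """Build the InfluxDB 1.x /query URL for SHOW MEASUREMENTS."""
    query = urlencode({"db": db, "q": "SHOW MEASUREMENTS"})
    return f"http://{host}:{port}/query?{query}"


def measurement_names(payload):
    """Extract measurement names from a decoded /query JSON response."""
    names = []
    for result in payload.get("results", []):
        # An empty database returns a result without a "series" key.
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def fetch_measurements(host, port, db, timeout=5):
    """Query the server and return the list of measurement names."""
    with urlopen(show_measurements_url(host, port, db), timeout=timeout) as resp:
        return measurement_names(json.load(resp))


# Placeholder values; replace with the real InfluxDB host and database.
print(show_measurements_url("localhost", 8086, "flink"))
```

Calling fetch_measurements("localhost", 8086, "flink") before and after running the job would show whether the reporter has written anything. If the server requires credentials, they would have to be added to the request, which this sketch leaves out.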