[jira] [Commented] (FLINK-7198) Datadog Metric Reporter reports incorrect host for JobManager

2017-07-16 Thread Robert Batts (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089051#comment-16089051
 ] 

Robert Batts commented on FLINK-7198:
-

Update: you can work around this by setting jobmanager.rpc.address, but I'm not 
sure whether that should be the default behavior for metrics reporting, so I'll 
leave this open for now.
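
For reference, a minimal flink-conf.yaml sketch of the workaround, assuming the 
Datadog HTTP reporter from flink-metrics-datadog; the reporter name "dghttp", 
the hostname, and the API key placeholder below are illustrative:

    # Pin the JobManager's address so a resolvable host is reported
    jobmanager.rpc.address: mesos-01.place.com

    # Datadog HTTP reporter configuration
    metrics.reporters: dghttp
    metrics.reporter.dghttp.class: org.apache.flink.metrics.datadog.DatadogHttpReporter
    metrics.reporter.dghttp.apikey: <your-datadog-api-key>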

> Datadog Metric Reporter reports incorrect host for JobManager
> -
>
> Key: FLINK-7198
> URL: https://issues.apache.org/jira/browse/FLINK-7198
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.3.1
> Environment: RHEL 7.3, Mesos 1.3, Datadog
>Reporter: Robert Batts
>Priority: Minor
>
> When using the Datadog Metric Reporter with a Mesos-deployed Flink 1.3.1 
> cluster, the JobManager is reported to Datadog with a tag of host:127.0.0.1. 
> The TaskManagers report with the correct tag (e.g. host:mesos-02.place.com), 
> so this appears to be an issue with the way host information is gathered 
> for the Datadog reporter.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7200) Make metrics more Datadog friendly

2017-07-16 Thread Robert Batts (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Batts closed FLINK-7200.
---
Resolution: Information Provided

> Make metrics more Datadog friendly
> --
>
> Key: FLINK-7200
> URL: https://issues.apache.org/jira/browse/FLINK-7200
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.3.1
>Reporter: Robert Batts
>Priority: Minor
>
> The current output of the Datadog Reporter is a little unfriendly to the 
> target platform from a metric-name perspective. Take, for example, the 
> metric reported by the Datadog Kafka integration:
> kafka.consumer_lag [tags: topic, consumer_group, partition]
> Through the use of tags (in this case topic, consumer_group, and partition) 
> you can create graphs in Datadog filtered to a specific topic and 
> consumer_group and then averaged over each partition. This lets you 
> visualize something like a heatmap of lag per partition for a consumer.
> So what am I suggesting for Flink? Currently, I think the tags for Datadog 
> are in a great place. Tags like job_id and subtask_id are great for 
> filtering and grouping. But the metric name is currently too specific to a 
> taskmanager and subtask. Currently, the metrics look something like this:
> flink_w04.taskmanager.4f378aff5730.TwitterExample.ExtractHashtags.7.numRecordsOut
> {host}.taskmanager.{tm_id}.{job_name}.{operator_name}.{subtask_index}.{metric_name}
> What I am suggesting is something more like this:
> taskmanager.TwitterExample.ExtractHashtags.numRecordsOut
> taskmanager.{job_name}.{operator_name}.{metric_name}
> (or even taskmanager.{metric_name}, but that would be a lot of tags on a 
> single metric)
> By doing this, someone could create a graph of numRecordsOut for an entire 
> task with a single metric in Datadog, rather than combining the metric for 
> every subtask_index using the tm_id tag (which could change if a tm_id 
> dropped out of the cluster). Additionally, given the current set of tags 
> being output to Datadog, there is a ton of grouping and filtering that 
> would be available if everything were on a simplified metric name.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7200) Make metrics more Datadog friendly

2017-07-16 Thread Robert Batts (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089048#comment-16089048
 ] 

Robert Batts commented on FLINK-7200:
-

You're absolutely right. This is what I get for opening Jira on a Friday.

> Make metrics more Datadog friendly
> --
>
> Key: FLINK-7200
> URL: https://issues.apache.org/jira/browse/FLINK-7200
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.3.1
>Reporter: Robert Batts
>Priority: Minor
>
> The current output of the Datadog Reporter is a little unfriendly to the 
> target platform from a metric-name perspective. Take, for example, the 
> metric reported by the Datadog Kafka integration:
> kafka.consumer_lag [tags: topic, consumer_group, partition]
> Through the use of tags (in this case topic, consumer_group, and partition) 
> you can create graphs in Datadog filtered to a specific topic and 
> consumer_group and then averaged over each partition. This lets you 
> visualize something like a heatmap of lag per partition for a consumer.
> So what am I suggesting for Flink? Currently, I think the tags for Datadog 
> are in a great place. Tags like job_id and subtask_id are great for 
> filtering and grouping. But the metric name is currently too specific to a 
> taskmanager and subtask. Currently, the metrics look something like this:
> flink_w04.taskmanager.4f378aff5730.TwitterExample.ExtractHashtags.7.numRecordsOut
> {host}.taskmanager.{tm_id}.{job_name}.{operator_name}.{subtask_index}.{metric_name}
> What I am suggesting is something more like this:
> taskmanager.TwitterExample.ExtractHashtags.numRecordsOut
> taskmanager.{job_name}.{operator_name}.{metric_name}
> (or even taskmanager.{metric_name}, but that would be a lot of tags on a 
> single metric)
> By doing this, someone could create a graph of numRecordsOut for an entire 
> task with a single metric in Datadog, rather than combining the metric for 
> every subtask_index using the tm_id tag (which could change if a tm_id 
> dropped out of the cluster). Additionally, given the current set of tags 
> being output to Datadog, there is a ton of grouping and filtering that 
> would be available if everything were on a simplified metric name.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7200) Make metrics more Datadog friendly

2017-07-14 Thread Robert Batts (JIRA)
Robert Batts created FLINK-7200:
---

 Summary: Make metrics more Datadog friendly
 Key: FLINK-7200
 URL: https://issues.apache.org/jira/browse/FLINK-7200
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.3.1
Reporter: Robert Batts
Priority: Minor


The current output of the Datadog Reporter is a little unfriendly to the 
target platform from a metric-name perspective. Take, for example, the metric 
reported by the Datadog Kafka integration:

kafka.consumer_lag [tags: topic, consumer_group, partition]

Through the use of tags (in this case topic, consumer_group, and partition) you 
can create graphs in Datadog filtered to a specific topic and consumer_group 
and then averaged over each partition. This lets you visualize something like 
a heatmap of lag per partition for a consumer.
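
As a concrete illustration of that tag-based workflow, a Datadog metric query 
along the following lines filters to one topic and consumer group and breaks 
the series out per partition; the tag values here are made up:

    avg:kafka.consumer_lag{topic:orders,consumer_group:fraud-check} by {partition}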

So what am I suggesting for Flink? Currently, I think the tags for Datadog are 
in a great place. Tags like job_id and subtask_id are great for filtering and 
grouping. But the metric name is currently too specific to a taskmanager and 
subtask. Currently, the metrics look something like this:

flink_w04.taskmanager.4f378aff5730.TwitterExample.ExtractHashtags.7.numRecordsOut
{host}.taskmanager.{tm_id}.{job_name}.{operator_name}.{subtask_index}.{metric_name}

What I am suggesting is something more like this:

taskmanager.TwitterExample.ExtractHashtags.numRecordsOut
taskmanager.{job_name}.{operator_name}.{metric_name}
(or even taskmanager.{metric_name}, but that would be a lot of tags on a single 
metric)

By doing this, someone could create a graph of numRecordsOut for an entire task 
with a single metric in Datadog, rather than combining the metric for every 
subtask_index using the tm_id tag (which could change if a tm_id dropped out of 
the cluster). Additionally, given the current set of tags being output to 
Datadog, there is a ton of grouping and filtering that would be available if 
everything were on a simplified metric name.
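
One way to approximate the suggested names with what exists today is Flink's 
metric scope formats in flink-conf.yaml. A hedged sketch follows; note that 
this changes the identifier for every configured reporter, not only Datadog, 
and dropping <subtask_index> from the scope means the per-subtask distinction 
only survives in reporters that attach it as a tag:

    # Default operator scope in 1.3 (produces the long names above):
    #   <host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>
    # Simplified variant along the lines proposed here:
    metrics.scope.operator: taskmanager.<job_name>.<operator_name>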



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7198) Datadog Metric Reporter reports incorrect host for JobManager

2017-07-14 Thread Robert Batts (JIRA)
Robert Batts created FLINK-7198:
---

 Summary: Datadog Metric Reporter reports incorrect host for 
JobManager
 Key: FLINK-7198
 URL: https://issues.apache.org/jira/browse/FLINK-7198
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.3.1
 Environment: RHEL 7.3, Mesos 1.3, Datadog
Reporter: Robert Batts
Priority: Minor


When using the Datadog Metric Reporter with a Mesos-deployed Flink 1.3.1 
cluster, the JobManager is reported to Datadog with a tag of host:127.0.0.1. 
The TaskManagers report with the correct tag (e.g. host:mesos-02.place.com), so 
this appears to be an issue with the way host information is gathered for the 
Datadog reporter.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Robert Batts (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Batts updated FLINK-3769:

Description: 
The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key routes directly to the queue 
name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 
exchange which routes to 1 queue). Additionally, if a user decides to use a 
different exchange, I think the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing).

The fourth exchange type relies on the message headers, which are currently set 
to null by default on publish. These headers may vary on a per-message level, 
so the sink will need to take them as input as well. This fourth exchange type 
could very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
enable headers" Improvement might be the better way to go.

Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html

  was:
The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key routes directly to the queue 
name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 
exchange which routes to 1 queue). Additionally, if a user decides to use a 
different exchange, I think the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing).

The fourth exchange type relies on the message headers, which are currently set 
to null by default on publish. These headers may vary on a per-message level, 
so the sink will need to take them as input as well. This fourth exchange type 
could very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
enable headers" Improvement might be the better way to go.


> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key routes directly to the 
> queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing).
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on publish. These headers may vary on a per-message 
> level, so the sink will need to take them as input as well. This fourth 
> exchange type could very well be outside the scope of this Improvement, and 
> a "RabbitMQ Sink enable headers" Improvement might be the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-04-15 Thread Robert Batts (JIRA)
Robert Batts created FLINK-3769:
---

 Summary: RabbitMQ Sink ability to publish to a different exchange
 Key: FLINK-3769
 URL: https://issues.apache.org/jira/browse/FLINK-3769
 Project: Flink
  Issue Type: Improvement
  Components: Streaming Connectors
Affects Versions: 1.0.1
Reporter: Robert Batts


The RabbitMQ Sink can currently only publish to the "default" exchange. This 
exchange is a direct exchange, so the routing key routes directly to the queue 
name. Because of this, the current sink will only be 1-to-1-to-1 (1 job to 1 
exchange which routes to 1 queue). Additionally, if a user decides to use a 
different exchange, I think the following can be assumed:

1.) The provided exchange exists
2.) The user has declared the appropriate mapping and the appropriate queues 
exist in RabbitMQ (therefore, nothing needs to be created)

RabbitMQ currently provides four types of exchanges. Three of these will be 
covered by just enabling exchanges (Direct, Fanout, Topic) because they use the 
routing key (or nothing).

The fourth exchange type relies on the message headers, which are currently set 
to null by default on publish. These headers may vary on a per-message level, 
so the sink will need to take them as input as well. This fourth exchange type 
could very well be outside the scope of this Improvement, and a "RabbitMQ Sink 
enable headers" Improvement might be the better way to go.
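
To make the suggestion concrete, here is a hedged Java sketch of publishing 
against a pre-declared, non-default exchange with the RabbitMQ Java client; the 
queue, exchange, and routing-key names are placeholders, and this is a sketch 
of the idea rather than the actual RMQSink code:

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.Map;

    public class ExchangePublishSketch {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder host

            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);

            // Today: default ("") exchange, routing key doubles as the queue name.
            channel.basicPublish("", "my-queue", null, body);

            // Suggested: publish to an existing, user-declared exchange.
            // Covers direct/fanout/topic exchanges via the routing key.
            channel.basicPublish("my-exchange", "my.routing.key", null, body);

            // Headers exchange: routing is driven by message headers, so
            // per-message AMQP.BasicProperties would need to be supplied.
            Map<String, Object> headers = new HashMap<>();
            headers.put("format", "json");
            AMQP.BasicProperties props =
                    new AMQP.BasicProperties.Builder().headers(headers).build();
            channel.basicPublish("my-headers-exchange", "", props, body);

            channel.close();
            connection.close();
        }
    }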



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3763) RabbitMQ Source/Sink standardize connection parameters

2016-04-14 Thread Robert Batts (JIRA)
Robert Batts created FLINK-3763:
---

 Summary: RabbitMQ Source/Sink standardize connection parameters
 Key: FLINK-3763
 URL: https://issues.apache.org/jira/browse/FLINK-3763
 Project: Flink
  Issue Type: Improvement
  Components: Streaming Connectors
Affects Versions: 1.0.1
Reporter: Robert Batts


The RabbitMQ source and sink should have the same capabilities in terms of 
establishing a connection; currently the sink is lacking connection parameters 
that are available on the source. Additionally, VirtualHost should be an 
offered parameter for multi-tenant RabbitMQ clusters (if not specified, the 
connection goes to the vhost '/').

Connection Parameters
===
- Host - Offered on both
- Port - Source only
- Virtual Host - Neither
- User - Source only
- Password - Source only

Additionally, it might be worth offering the URI as a valid constructor 
parameter, because that would cover all five of the above parameters in a 
single String.
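
For illustration, a hedged Java sketch of the two equivalent ways to supply 
those parameters with the RabbitMQ Java client's ConnectionFactory; the host, 
vhost, and credentials below are placeholders:

    import com.rabbitmq.client.ConnectionFactory;

    public class ConnectionConfigSketch {
        public static void main(String[] args) throws Exception {
            // Individual parameters: host, port, virtual host, user, password.
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("rabbit.place.com");
            factory.setPort(5672);
            factory.setVirtualHost("analytics");
            factory.setUsername("flink");
            factory.setPassword("secret");

            // Equivalent single-string URI covering all five parameters.
            ConnectionFactory fromUri = new ConnectionFactory();
            fromUri.setUri("amqp://flink:secret@rabbit.place.com:5672/analytics");
        }
    }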




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)