Kinesis Producer in 1.4.2: testing locally with Kinesalite not working

2018-07-22 Thread Philipp Bussche
Hi there, when trying to use a KinesisProducer which has both aws.region and aws.endpoint set, I am receiving the following error message: 07/22/2018 19:52:42 Source: EventStream -> (Sink: Unnamed, EventAndVenueMap -> (Filter, Sink: EventToInventorySink), Sink: ElasticSearchEventsSink)(4/8) sw
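
For context, the two keys named in the thread are the plain-string values of Flink's Kinesis connector config constants. A minimal sketch of such a producer configuration, assuming Kinesalite on its default port 4567 (the endpoint URL and the dummy credential keys are illustrative, not taken from the thread):

```java
import java.util.Properties;

class KinesaliteProducerConfig {

    // Builds the Properties one would pass to a FlinkKinesisProducer.
    // "aws.endpoint" points the connector at a local Kinesalite instead of AWS.
    static Properties build() {
        Properties config = new Properties();
        config.setProperty("aws.region", "us-east-1");                // still required
        config.setProperty("aws.endpoint", "https://localhost:4567"); // Kinesalite default port
        // Kinesalite does not validate credentials, but the connector wants some:
        config.setProperty("aws.credentials.provider.basic.accesskeyid", "fake");
        config.setProperty("aws.credentials.provider.basic.secretkey", "fake");
        return config;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("aws.endpoint"));
    }
}
```

The thread suggests that setting both keys together trips up the 1.4.2 producer, so treat this as the intended configuration rather than a confirmed working one.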

Re: Sink metric numRecordsIn drops temporarily

2017-09-17 Thread Philipp Bussche
Hi, thank you for your answer. So for September 11th, which is shown on the screenshot, the counter sat at 26.91k; when the drop happened it went down to 26.01k. This happened 3 times during that day and it always went back to the same value. Philipp

Sink metric numRecordsIn drops temporarily

2017-09-17 Thread Philipp Bussche
Hi there, I witnessed an interesting behaviour on my Grafana dashboard where sink-related metrics would show activity where there should not be any. I am saying this because, for this particular sink, activity is triggered by data being ingested through a cronjob at a particular time; however, the das

Async I/O & async-http-client & Netty

2017-05-05 Thread Philipp Bussche
Hi there, just wanted to let you guys know that I was playing around with Async I/O and async-http-client, and there seems to be an incompatibility somewhere around Netty. I wanted to use an async function to enrich my data with results coming from Foursquare's API, and it works fine when adding the belo
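
A common way out of this kind of Netty clash is to exclude the transitive Netty that async-http-client pulls in and let Flink's shaded Netty win. A hypothetical Maven fragment (coordinates and version are assumptions, not taken from the thread):

```xml
<!-- hypothetical: drop the transitive Netty pulled in by async-http-client -->
<dependency>
  <groupId>org.asynchttpclient</groupId>
  <artifactId>async-http-client</artifactId>
  <version>2.0.32</version> <!-- illustrative version -->
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Whether the excluded Netty can actually be replaced by the one on Flink's classpath depends on which Netty APIs the client uses, so this is a starting point rather than a guaranteed fix.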

Graphite reporter recover from broken pipe

2017-02-01 Thread Philipp Bussche
Hi there, after moving my Graphite service to another host, my task manager does not recover monitoring and keeps complaining about a broken pipe. It sounds a bit like this one: https://github.com/dropwizard/metrics/pull/1036 What do I need to do to update dropwizard to a 3.1.x version to

Re: Subtask keeps on discovering new Kinesis shard when using Kinesalite

2016-11-18 Thread Philipp Bussche
Hello Gordon, thank you for the patch. I can confirm that discovery looks good now and it does not re-discover shards every few seconds. I will do more testing with this now but it looks very good already! Thanks, Philipp

Re: Why use Kafka after all?

2016-11-16 Thread Philipp Bussche
Hi Dromit, I started using Flink with Kafka but am currently looking into Kinesis to replace Kafka. The reason behind this is that eventually my application will run in somebody's cloud, and if I go for AWS then I don't have to take care of operating Kafka and Zookeeper myself. I understand this can

Re: Subtask keeps on discovering new Kinesis shard when using Kinesalite

2016-11-16 Thread Philipp Bussche
Hello Gordon, thank you for your help. I have set the discovery interval to 30 seconds and just started the job on a clean Kinesalite service (I am running it inside Docker; the container is stopped and removed every time so it starts from scratch). This is the output without actually any data i
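
For reference, the 30-second interval mentioned here is passed through the consumer's Properties. The key below is the one used by later 1.x releases of the Kinesis connector; the exact constant may differ in the 1.2-SNAPSHOT used in this thread, so treat it as an assumption:

```
flink.shard.discovery.intervalmillis: 30000
```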

Subtask keeps on discovering new Kinesis shard when using Kinesalite

2016-11-15 Thread Philipp Bussche
Hi there, I am looking into AWS Kinesis and wanted to test with a local install of Kinesalite. This is on Flink 1.2-SNAPSHOT. However, it looks like my subtask keeps on discovering new shards, indicated by the following log message which is constantly written: 21:45:42,867 INFO org.apache.flin

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-02 Thread Philipp Bussche
Hi there, I am using Graphite and querying it in Grafana is super easy. You just select fields and they come up automatically for you to choose from, depending on what your metric structure in Graphite looks like. You can also use wildcards. The only thing I had to do, because I am also using containe
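
To make the wildcard idea concrete, a Graphite target of the kind described might look like the line below. The path layout is an assumption (it depends on your metrics.scope configuration; the default operator scope is roughly host.taskmanager.tm_id.job_name.operator_name.subtask_index):

```
aliasByNode(*.taskmanager.*.my_job.*.*.numRecordsIn, 4)
```

Here `aliasByNode(..., 4)` labels each series by the operator-name segment; adjust the index to whatever segment is meaningful in your layout.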

Re: Elasticsearch Http Connector

2016-10-26 Thread Philipp Bussche
Hi, yes I wrote an Elasticsearch sink for Flink that uses Jest. Sure, I will make something available for you. I am just travelling at the moment, so it will be a few hours before it is there. Thanks

Re: Elasticsearch Http Connector

2016-10-25 Thread Philipp Bussche
Hi there, not me (which I guess is not what you wanted to hear), but I had to implement a custom Elasticsearch sink based on Jest to be able to sink data into ES on AWS. Works quite alright! Philipp https://github.com/searchbox-io/Jest/tree/master/jest

Re: Task and Operator Monitoring via JMX / naming

2016-10-20 Thread Philipp Bussche
Thanks Chesnay! I am not too familiar with the release cycles here, but was wondering when one could expect your fix to be in the master of Flink? Should I create my own build for the moment maybe? Thanks.

Re: Task and Operator Monitoring via JMX / naming

2016-10-20 Thread Philipp Bussche
Thanks Chesnay, I am happy to share more around my environment and do additional testing for this. I would also be happy to help fix this if we see there might be an issue in the code somewhere. In fact I am still trying to get a Hacktoberfest T-Shirt and I am still pull requests short ;)

Re: Task and Operator Monitoring via JMX / naming

2016-10-19 Thread Philipp Bussche
Some further observations: I had a job which was taking events off a Kafka topic and sending them to two sinks, whereas for one of them a Map operation would happen first. When creating one event stream and sending it to the two sinks, the JMX representation was not showing both sinks, and the naming of

Re: Task and Operator Monitoring via JMX / naming

2016-10-17 Thread Philipp Bussche
Thanks Chesnay. I had a look at what the JMX representation looks like for a Task Manager which has one of the example jobs deployed (https://ci.apache.org/projects/flink/flink-docs-release-1.1/quickstart/run_example_quickstart.html) and this looks correct. I assume at this point that th

Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Philipp Bussche
Thanks Chesnay, this is on Flink 1.1.3. Please also note that e.g. the first item in the list, which has the custom metric attached to it, starts with a leading "(". It might be that the parsing of the names is not working quite as expected. I was trying to find out where these names come from but was

Task and Operator Monitoring via JMX / naming

2016-10-14 Thread Philipp Bussche
Hi there, I am struggling to understand what I am looking at after enabling JMX metric reporting on my taskmanager. The job I am trying this out with has 1 source, 2 map functions (where one is a RichMap) and 3 sinks. This is how I have written my job: DataStream invitations = streaming
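
For context, enabling the JMX reporter on the Flink 1.1.x line is done in flink-conf.yaml along these lines (the port range is illustrative):

```
metrics.reporters: jmx
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789-8799
```

The task and operator MBeans discussed in this thread then appear under the configured metric scope once a job is running.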

Re: Enriching a tuple mapped from a datastream with data coming from a JDBC source

2016-10-04 Thread Philipp Bussche
Awesome, thanks Fabian! I will give this a try.

Fabian Hueske-2 wrote
> Hi Philipp,
>
> If I got your requirements right you would like to:
> 1) load an initial hashmap via JDBC
> 2) update the hashmap from a stream
> 3) use the hashmap to enrich another stream.
>
> You can use a CoFlatMap to
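
The three steps in Fabian's reply can be sketched, with all Flink plumbing (the CoFlatMapFunction wrapper, keying, the actual JDBC call in open()) stripped away, as a small plain-Java state holder. All names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the CoFlatMap enrichment pattern:
// one shared map, one update path, one enrichment path.
class EnrichmentState {

    private final Map<String, String> reference = new HashMap<>();

    // Step 1: seed the map once, e.g. from a JDBC query in open().
    void seed(Map<String, String> initial) {
        reference.putAll(initial);
    }

    // Step 2: the flatMap1 side -- updates arriving on the first stream.
    void update(String key, String value) {
        reference.put(key, value);
    }

    // Step 3: the flatMap2 side -- enrich an element of the second stream.
    String enrich(String key, String payload) {
        return payload + " [" + reference.getOrDefault(key, "?") + "]";
    }
}
```

In a real CoFlatMapFunction both paths run in the same parallel instance, which is exactly why the map can be shared without extra synchronization; on a keyed stream you would use Flink's managed state instead of a raw HashMap.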

Re: Enriching a tuple mapped from a datastream with data coming from a JDBC source

2016-10-03 Thread Philipp Bussche
Hi again, I implemented the RichMapFunction (the open method runs a JDBC query to populate a HashMap with data) which I am using in the map function. Now there is another RichMap.map function that would add to the HashMap that was initialized in the first function. How would I share the map between th

Re: Enriching a tuple mapped from a datastream with data coming from a JDBC source

2016-09-12 Thread Philipp Bussche
Thank you Konstantin, the amount of data I have to load into memory will be very small, so that should be alright. When opening and querying the database, would I use any sort of Flink magic or just do plain JDBC? I read about the JDBCInput concept which one could use with the DataSet API and was wo

Enriching a tuple mapped from a datastream with data coming from a JDBC source

2016-09-11 Thread Philipp Bussche
Hi there, I have a data stream (coming from Kafka) that contains information which I want to enrich with information that sits in a database, before I hand over the enriched tuple to a sink. How would I do that? I was thinking of somehow combining my streaming job with a JDBC input but wasn't very s