Re: Help using HBase with Flink 1.1.4

2017-01-18 Thread Flavio Pompermaier
Can you post the shading section of your pom please? On 19 Jan 2017 08:01, "Giuliano Caliari" wrote: > Hey guys, > > I have tried the shading trick but I can't get it to work. > I followed the documented steps and it builds but when I try to run the > newly built
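For readers finding this thread later: a shading section for this kind of clash typically relocates the conflicting packages (Guava is the usual offender in HBase/Flink combinations). The snippet below is a hypothetical sketch of what such a section can look like, not Giuliano's actual pom; adjust the relocation pattern to whichever dependency actually conflicts.

```xml
<!-- Hypothetical maven-shade-plugin configuration (sketch only). -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>my.project.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```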

Re: How to get help on ClassCastException when re-submitting a job

2017-01-18 Thread Giuliano Caliari
Hello, Yuri's description of the issue is spot on. We are running our cluster on YARN and using Avro for serialization, exactly as described. @Ufuk, I'm running my cluster on YARN, 4 Task Managers with 2 slots each, but this particular job has parallelism 1. @Yuri, I'll test your fix as soon

Re: Help using HBase with Flink 1.1.4

2017-01-18 Thread Giuliano Caliari
Hey guys, I have tried the shading trick but I can't get it to work. I followed the documented steps and it builds but when I try to run the newly built version it fails when trying to connect to the Resource Manager: 2017-01-17 00:42:05,872 INFO org.apache.flink.yarn.YarnClusterDescriptor

Register a user scope metric in window reduce function

2017-01-18 Thread tao xiao
Hi team, Is there any way that I can register a metric in a window reduce function? As per the Flink docs, getRuntimeContext is only available in RichFunction, but the window operator doesn't allow a RichFunction to be chained. Any way to work around this?
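One workaround is to pair the ReduceFunction with a RichWindowFunction via apply(reduceFunction, windowFunction); the rich window function has access to the runtime context and can register metrics in open(). The sketch below is untested, assumes the Flink 1.1 DataStream API on the classpath, and uses illustrative names:

```java
// Sketch only: a RichWindowFunction chained after a ReduceFunction can register metrics.
public class MeteredWindow extends RichWindowFunction<Long, Long, String, TimeWindow> {

    private transient Counter windowsFired;

    @Override
    public void open(Configuration parameters) {
        // getRuntimeContext() is available here, unlike in a plain ReduceFunction
        windowsFired = getRuntimeContext().getMetricGroup().counter("windowsFired");
    }

    @Override
    public void apply(String key, TimeWindow window, Iterable<Long> preReduced, Collector<Long> out) {
        windowsFired.inc();
        for (Long v : preReduced) {
            out.collect(v); // a single pre-reduced value per window
        }
    }
}

// Usage sketch: stream.keyBy(...).timeWindow(Time.minutes(1)).apply(reduceFn, new MeteredWindow());
```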

Flink SQL on JSON data without schema

2017-01-18 Thread Nihat Hosgur
Hi there, We are evaluating Flink SQL to understand whether it would be a better fit than Spark. So far we have loved how natural it is to consume streams in Flink. We read a bunch of Kafka topics and would like to join those streams and eventually run some SQL queries. We've used Kafka tables yet if I'm

Keep bootstrapped config updated with stream from Kafka

2017-01-18 Thread Nihat Hosgur
Hi all, We bootstrap data from a DB and then would like to keep it updated with updates coming through Kafka. In Spark this was fairly easy using updateStateByKey, yet I'm stuck figuring out how to do it with Flink. I've taken a look at iterate, yet I don't think it's meant to
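The usual Flink pattern for this is to connect the bootstrapped stream with the Kafka update stream and keep the current value per key in keyed state (e.g. a CoFlatMapFunction). Stripped of the Flink APIs, the core bookkeeping is just "snapshot first, then last write wins per key" — a hypothetical plain-Java illustration, with all names invented for this sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class BootstrappedConfig {
    private final Map<String, String> config = new HashMap<>();

    // Phase 1: load the initial snapshot from the DB
    public void bootstrap(Map<String, String> snapshot) {
        config.putAll(snapshot);
    }

    // Phase 2: apply each Kafka update as it arrives (last write wins per key)
    public void onUpdate(String key, String value) {
        config.put(key, value);
    }

    public String get(String key) {
        return config.get(key);
    }
}
```

In an actual job the two phases would be two connected streams, with the map held in Flink's keyed state so that it is checkpointed and restored with the job.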

Re: Window limitations on groupBy

2017-01-18 Thread Raman Gupta
Thank you for your reply. If I were to use a keyed stream with a count-based window of 2, would Flink keep the last state persistently until the next state is received? Would this be another way of having Flink keep this information persistently without having to implement it manually? Thanks,

Re: Kafka Fetch Failed / DisconnectException

2017-01-18 Thread Jonas
Hello Fabian, that IS the error message. The job continues to run without restarting. There is not really more to see in the logs. -- Jonas

seeding a stream job

2017-01-18 Thread Jared Stehler
I have a use case where I need to start a stream replaying historical data, and then have it continue processing on a live kafka source, and am looking for guidance / best practices for implementation. Basically, I want to start up a new “version” of the stream job, and have it process each

ApacheCon CFP closing soon (11 February)

2017-01-18 Thread Rich Bowen
Hello, fellow Apache enthusiast. Thanks for your participation in, and interest in, the projects of the Apache Software Foundation. I wanted to remind you that the Call For Papers (CFP) for ApacheCon North America and Apache: Big Data North America closes in less than a month. If you've been

Re: Kafka Fetch Failed / DisconnectException

2017-01-18 Thread Fabian Hueske
Hi Jonas, your mail did not include the error message. Can you send it again? Thanks, Fabian 2017-01-18 17:37 GMT+01:00 Jonas : > Hi! > > According to the output, I'm having some problems with the KafkaConsumer09. > It reports the following on stdout: > > > > Is that something

Kafka Fetch Failed / DisconnectException

2017-01-18 Thread Jonas
Hi! According to the output, I'm having some problems with the KafkaConsumer09. It reports the following on stdout: Is that something I should worry about? -- Jonas

Re: Kafka KeyedStream source

2017-01-18 Thread Fabian Hueske
Hi Niels, I was more talking from a theoretical point of view. Flink does not have a hook to inject a custom hash function (yet). I'm not familiar with the details of the implementation to make an assessment whether this would be possible or how much work it would be. However, several users have

Re: Window limitations on groupBy

2017-01-18 Thread Fabian Hueske
Hi Raman, I would approach this issue as follows. You key the input stream on the sourceId and apply a stateful FlatMapFunction. The FlatMapFunction has key-partitioned state and stores for each key (sourceId) the latest event as state. When a new event arrives, you can compute the time spent
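The approach described above boils down to: for each key, remember the previous event, and when the next one arrives, emit the time spent in the previous state. A minimal plain-Java sketch of that bookkeeping (in the Flink version, the last event would live in key-partitioned ValueState inside a RichFlatMapFunction; all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class TimeInState {
    static final class Event {
        final String state;
        final long timestampMillis;
        Event(String state, long timestampMillis) {
            this.state = state;
            this.timestampMillis = timestampMillis;
        }
    }

    // Stands in for Flink's key-partitioned state: the latest event per sourceId
    private final Map<Integer, Event> lastBySource = new HashMap<>();

    /**
     * Returns the millis spent in the previous state for this sourceId,
     * or -1 if this is the first event seen for the key.
     */
    public long onEvent(int sourceId, String state, long timestampMillis) {
        Event previous = lastBySource.put(sourceId, new Event(state, timestampMillis));
        return previous == null ? -1 : timestampMillis - previous.timestampMillis;
    }
}
```

Because the state is keyed, each sourceId is tracked independently, and Flink would checkpoint it so the "last event" survives failures.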

Window limitations on groupBy

2017-01-18 Thread Raman Gupta
I am investigating Flink. I am considering a relatively simple use case -- I want to ingest streams of events that are essentially timestamped state changes. These events may look something like: { sourceId: 111, state: OPEN, timestamp: } I want to apply various processing to these state

Re: Strategies for Complex Event Processing with guaranteed data consistency

2017-01-18 Thread Fabian Hueske
Hi Kat, thanks for the clarification about cases and traces. Regarding the aggregation of traces: You can either do that in the same job that constructs the cases or in a job which is decoupled by for instance Kafka. If I got your requirements right, you need a mechanism for retraction. A case

Re: Possible JVM native memory leak

2017-01-18 Thread Stefan Richter
Hi, The answer to question one is clearly yes, and you can configure RocksDB through the DBOptions. Question two is obviously more tricky with the given information. But it is surely possible that some resources are not properly released. All classes from the RocksDB Java API have a safety

Re: How to get help on ClassCastException when re-submitting a job

2017-01-18 Thread Yury Ruchin
For my case I tracked down the culprit. It's been Avro indeed. I'm providing details below, since I believe the pattern is pretty common for such issues. In YARN setup there are several sources where classes are loaded from: Flink lib directory, YARN lib directories, user code. The first two

Re: Fault tolerance guarantees of Elasticsearch sink in flink-elasticsearch2?

2017-01-18 Thread Tzu-Li (Gordon) Tai
Hi Andrew! There’s nothing special about extending the checkpointing interfaces for the SinkFunction; for Flink they’re essentially user functions that have user state to be checkpointed. So yes, you’ll just implement it as you would for a flatMap / map / etc. function. Feel free to let me

Re: Rolling sink parquet/Avro output

2017-01-18 Thread elmosca
Hi Biswajit, We use the following Writer for Parquet using Avro conversion (in Scala), with this library as a dependency: "org.apache.parquet" % "parquet-avro" % "1.8.1". We use this writer in a rolling sink and it seems fine so far. Cheers, Bruno

Re: Rolling sink parquet/Avro output

2017-01-18 Thread Bruno Aranda
Sorry, something went wrong with the code for the Writer. Here it is again: import org.apache.avro.Schema import org.apache.flink.streaming.connectors.fs.Writer import org.apache.hadoop.fs.{FileSystem, Path} import org.apache.parquet.avro.AvroParquetWriter import

Re: Apache Flink 1.1.4 - Java 8 - CommunityDetection.java:158 - java.lang.NullPointerException

2017-01-18 Thread Vasiliki Kalavri
Great! Let us know if you need help. -Vasia. On 17 January 2017 at 10:30, Miguel Coimbra wrote: > Hello Vasia, > > I am going to look into this. > Hopefully I will contribute to the implementation and documentation. > > Regards, > > -- Forwarded message

Re: Zeppelin: Flink Kafka Connector

2017-01-18 Thread Fabian Hueske
Ah, OK :-) Thanks for reporting back! Cheers, Fabian 2017-01-17 17:50 GMT+01:00 Neil Derraugh < neil.derra...@intellifylearning.com>: > I re-read that enough times and it finally made sense. I wasn’t paying > attention and thought 0.10.2 was the Kafka version —which hasn’t been > released yet