Re: FlinkKafkaConsumer010 does not start from the next record on startup from offsets in Kafka

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi Robert, Uncaught exceptions that cause the job to fall into a fail-and-restart loop is likewise to the corrupt record case I mentioned. With exactly-once guarantees, the job will roll back to the last complete checkpoint, which "resets" the Flink consumer to some earlier Kafka partition offset

Bad entry in block exception with RocksDB

2017-11-22 Thread Kien Truong
Hi, We are seeing this exception in one of our job, whenever a check point or save point is performed. java.lang.RuntimeException: Error while adding data to RocksDB at org.apache.flink.contrib.streaming.state.RocksDBListState.add(RocksDBListState.java:119) at org.apache.flink.runtime.state.Us

Re: Flink stress testing and metrics

2017-11-22 Thread Ladhari Sadok
Normally it should return 0ms in case of no latency not NaN, and my real data size is 1kb, but for now I'm using 200 bytes, I will try it with the real size later. For the data generator, it is an infinite for loop. Thanks. 2017-11-22 18:11 GMT+01:00 Timo Walther : > At a first glance I would s

Re: How to write dataset as parquet format

2017-11-22 Thread Flavio Pompermaier
I usually refer to this: https://github.com/FelixNeutatz/parquet-flinktacular On 22 Nov 2017 18:29, "Fabian Hueske" wrote: > Hi Ebru, > > AvroParquetOutputFormat seems to implement Hadoop's OutputFormat interface. > Flink provides a wrapper for Hadoop's OutputFormat [1], so you can try to > wra

Re: How to write dataset as parquet format

2017-11-22 Thread Fabian Hueske
Hi Ebru, AvroParquetOutputFormat seems to implement Hadoop's OutputFormat interface. Flink provides a wrapper for Hadoop's OutputFormat [1], so you can try to wrap AvroParquetOutputFormat in Flink's HadoopOutputFormat. Hope this helps, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-r

Re: S3 Access in eu-central-1

2017-11-22 Thread Timo Walther
@Patrick: Do you have an advice? Am 11/22/17 um 5:52 PM schrieb domi...@dbruhn.de: Hey everyone, I'm trying since hours to get Flink 1.3.2 (downloaded for hadoop 2.7) to snapshot/checkpoint to an S3 bucket which is hosted in the eu-central-1 region. Everything works fine for other regions. I'

Re: Flink stress testing and metrics

2017-11-22 Thread Timo Walther
At a first glance I would say that your data size is very small. Flink is able to process millions of records on a single machine. It might be that the records are produced to quickly to be used for latency measuring. Is you data generator never-ending? Am 11/22/17 um 4:13 PM schrieb Ladhari

S3 Access in eu-central-1

2017-11-22 Thread dominik
Hey everyone, I'm trying since hours to get Flink 1.3.2 (downloaded for hadoop 2.7) to snapshot/checkpoint to an S3 bucket which is hosted in the eu-central-1 region. Everything works fine for other regions. I'm running my job on a JobTracker in local mode. I googled the internet and found seve

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-22 Thread Gordon Weakliem
Jared has a good point, what is mvn dependency:tree showing? On Wed, Nov 22, 2017 at 7:54 AM, Jared Stehler < jared.steh...@intellifylearning.com> wrote: > Protobuf is notorious for throwing things like “class not found” when > built and run with different versions of the library; I believe flink

Re: Flink stress testing and metrics

2017-11-22 Thread Ladhari Sadok
Thanks Timo for your answer. I have tried to setLatencyTrackingInterval(1000) but I have got the same result ( latency : NaN ) My Flink Job is a geofencing pattern : - [Latitude,Langitude ] < IN | OUT > Location ? Send Notification : None In my stress test I'm using data that always send no

Re: Correlation between data streams/operators and threads

2017-11-22 Thread Nico Kruber
Hi Shailesh, your JobManager log suggests that this same JVM instance actually contains a TaskManager as well (sorry for not noticing earlier). Also this time, there is nothing regarding the BlobServer/BlobCache, but it looks like the task manager may think the jobmanager is down. Can you try wi

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-22 Thread Jared Stehler
Protobuf is notorious for throwing things like “class not found” when built and run with different versions of the library; I believe flink is using protobuf 2.5.0 and you mentioned using 2.6.1, which I think would be a possible cause of this issue. -- Jared Stehler Chief Architect - Intellify Lea

How to write dataset as parquet format

2017-11-22 Thread ebru
Hello all, We are trying to write dataset as parquet format, we use AvroParquetOutputFormat but it is not compatible with Flink’s FileOutputFormat. Is there a way to write dataset as parquet? -Ebru

Re: FlinkKafkaConsumer010 does not start from the next record on startup from offsets in Kafka

2017-11-22 Thread r. r.
Thanks Gordon But what if there is an uncaught exception in processing of the record (during normal job execution, after deserialization)? After the restart strategy exceeds the failure rate, the job will fail and on re-run it would start at the same offset, right? Is there a way to avoid this an

Re: How to Create Sample Data from HDFS File using Flink ?

2017-11-22 Thread Timo Walther
Hi, the sampling functions are exposed in org.apache.flink.api.java.utils.DataSetUtils. So you can basically can create something like: final HadoopInputFormat inputFormat = HadoopInputs.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, hdfsPath); final DataSet> input

Re: Impersonate user for hdfs

2017-11-22 Thread Timo Walther
Hi Vishal, shouldn't it be possible to configure a proxy user via core-site.xml? Flink is also using this XML for HDFS. You can also set the configuration files manually, see https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#hdfs Regards, Timo Am 11/21/17 um 3:

Re: Flink stress testing and metrics

2017-11-22 Thread Timo Walther
Hi Sadok, it would be helpful if you could tell us a bit more about your job. E.g. a skewed key distribution where keys are only sent to one third of your operators can not use your CPUs full capabilities. The latency tracking interval is in milliseconds. Can you try if 1000 would fix your p

Re: Kafka consumer to sync topics by event time?

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi! The FlinkKafkaConsumer can handle watermark advancement with per-Kafka-partition awareness (across partitions of different topics). You can see an example of how to do that here [1]. Basically what this does is that it generates watermarks within the Kafka consumer individually for each Kafka

Re: FlinkKafkaConsumer010 does not start from the next record on startup from offsets in Kafka

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi Robert, As expected with exactly-once guarantees, a record that caused a Flink job to fail will be attempted to be reprocessed on the restart of the job. For some specific "corrupt" record that causes the job to fall into a fail-and-restart loop, there is a way to let the Kafka consumer skip t

Re: Kafka consumer to sync topics by event time?

2017-11-22 Thread Kien Truong
Hi, When you join multiple stream with different watermarks, the resulting stream's watermark will be the smallest of the input watermark, as long as you don't explicitly assign a new watermarks generator. In your example, if small_topic has watermark at time t1, big_topic has watermark at

Re: Flink REST API async?

2017-11-22 Thread Francisco Gonzalez
Sorry guys, in the previous message, when I talked about the task managers performance, I meant *Jobmanager* performance Francisco Gonzalez wrote > Hi guys, > > After investigating a bit more about this topic, we found a solution > adding > a small change in the Flink-1.3.2 source code. > > W

Re: Flink REST API async?

2017-11-22 Thread Francisco Gonzalez
Hi guys, After investigating a bit more about this topic, we found a solution adding a small change in the Flink-1.3.2 source code. We found that the issue occurred when different threads tried to build the Tuple2 object at the same time (due to they use the static ExecutionEnvironmnet variable

Re: Avoiding Dynamic Classloading

2017-11-22 Thread Aljoscha Krettek
Hi, Yes, if I remember correctly, this was changed in 1.2 to always include the user-jar in the system classloader on YARN. With Flink 1.4 we are changing the user-code classloader to load classes from the user-jar first (child-first classloading) by default so a lot of the comments on avoiding

Re: Tooling for resuming from checkpoints

2017-11-22 Thread Timo Walther
Hi Dominik, the Web UI shows you the status of a checkpoint [0], so it might be possible to retrieve the information via REST calls. Usually, you should perform a savepoint for planned restarts. If a savepoint is successful you can be sure to restart from it. Otherwise the platform from data

Tooling for resuming from checkpoints

2017-11-22 Thread dominik
Hey, we are running Flink 1.3.2 with streaming jobs and we are running into issues when we are restarting a complete job (which can happen due to various reasons: upgrading of the job, restarting of the cluster, failures). The problem is that there is no automated way to find out from which ch

Kafka consumer to sync topics by event time?

2017-11-22 Thread Juho Autio
I would like to understand how FlinkKafkaConsumer treats "unbalanced" topics. We're using FlinkKafkaConsumer010 with 2 topics, say "small_topic" & "big_topic". After restoring from an old savepoint (4 hours before), I checked the consumer offsets on Kafka (Flink commits offsets to kafka for refer

Re: Getting java.lang.ClassNotFoundException: for protobuf generated class

2017-11-22 Thread Nico Kruber
But wouldn't a failed dependency show another ClassNotFoundException? On Tuesday, 21 November 2017 20:31:58 CET Gordon Weakliem wrote: > Isn't one cause for ClassNotFoundException that the class can't load due to > failed dependencies or a failure in a static constructor? > > If jar -tf target/pr

Flink stress testing and metrics

2017-11-22 Thread Ladhari Sadok
Hi All, I want to do a stress testing of my Flink app implementation: event generation with ParallelSourceFunction then measuring the latency ,throughput, CPU & memry leak ... But when testing, I noticed that : - the maximum of CPU usage is 30-33% - latency is always NaNd NaNh in the dashb