Re: UI stability at high parallelism

2020-02-14 Thread Weihua Hu
These logs prove that it is indeed a timeout issue, In our scenario, it was due to the task deploy took a lot of time. You can check if the time from Task from SCHEDULED to DEPLOYING in the log is greater than 10s. This step are processed in mainThread and will block the processing of requests f

KafkaFetcher closed before end of stream is received for all partitions.

2020-02-14 Thread Bill Wicker
Hello all, I'm new to Flink and I have been developing a series of POCs in preparation for a larger project that will utilize Flink. One use case we have is to utilize the same job for both batch and streaming processing using Kafka as the source. When the job is run in batch mode we expect tha

Flink Savepoint error

2020-02-14 Thread Soheil Pourbafrani
Hi, I developed a Flink application that read data from files and inserts them into the database. During the job running, I attempted to get a savepoint and cancel the job but I got the following error: Caused by: java.util.concurrent.CompletionException: > org.apache.flink.runtime.checkpoint.Che

TaskManager Fail when I cancel the job and crash

2020-02-14 Thread Soheil Pourbafrani
Hi, I developed a single Flink job that read a huge amount of files and after some simple preprocessing, sink them into the database. I use the built-in JDBCOutputFormat for inserting records into the database. The problem is when I cancel the job using either the WebUI or the command line, the jo

Re: Flink job fails with org.apache.kafka.common.KafkaException: Failed to construct kafka consumer

2020-02-14 Thread John Smith
Hi Piotr, any thoughts on this? On Wed., Feb. 12, 2020, 3:29 a.m. Kostas Kloudas, wrote: > Hi John, > > As you suggested, I would also lean towards increasing the number of > allowed open handles, but > for recommendation on best practices, I am cc'ing Piotr who may be > more familiar with the K

Re: UI stability at high parallelism

2020-02-14 Thread Richard Moorhead
2020-02-14 11:50:35,402 ERROR org.apache.flink.runtime.rest.handler.job.JobsOverviewHandler - Unhandled exception. akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#1293527273]] after [1 ms]. Message of type [org.apache.flink.runtime.rpc.messages.LocalFenced

RE: Table API: Joining on Tables of Complex Types

2020-02-14 Thread Hailu, Andreas
Hi Timo, Dawid, This was very helpful - thanks! The Row type seems to only support getting fields by their index. Is there a way to get a field by its name like the Row class in Spark? Link: https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/Row.html#getAs(java.lang.String) Our

Re: Using multithreaded library within ProcessFunction with callbacks relying on the out parameter

2020-02-14 Thread Salva Alcántara
Hi Piotr, Since my current process function already works well for me, except for the fact I don't have access to the mailbox executor, I have simply created a custom operator for injecting that: ``` class MyOperator(myFunction: MyFunction) extends KeyedCoProcessOperator(myFunction) { privat

Persisting inactive state outside Flink

2020-02-14 Thread Akshay Aggarwal
Hi, We have a use case where we have to persist some state information about a device forever. Each new event will fetch the keyed state and update it. And this has to be applied in-order of events. The problem is that the number of devices (keys) will keep growing infinitely. Usually a device co

Re: [ANNOUNCE] Flink Forward San Francisco 2020 Program is Live

2020-02-14 Thread Fabian Hueske
Hi everyone, Sorry for writing another email but I forgot to mention the community discounts. When you register for the conference [1], you can use one of the following discount codes: * As a member of the Flink community we offer a 50% discount on your conference pass if you register with the co

[Flink 1.10] Classpath doesn't include custom files in lib/

2020-02-14 Thread Maxim Parkachov
Hi everyone, I'm trying to run my job with flink 1.10 with YARN cluster per-job mode. In the previous versions all files in lib/ folder were automatically included in classpath. Now, with 1.10 I see only *.jar files are included in classpath. but not "other" files. Is this deliberate change or bug

[ANNOUNCE] Flink Forward San Francisco 2020 Program is Live

2020-02-14 Thread Fabian Hueske
Hi everyone, We announced the program of Flink Forward San Francisco 2020. The conference takes place at the Hyatt Regency in San Francisco from March 23rd to 25th. On the first day we offer four training sessions [1]: * Apache Flink Developer Training * Apache Flink Runtime & Operations Training

Re:Re: Flink 1.10 es sink exception

2020-02-14 Thread sunfulin
Hi, Jark Appreciate for your reply. insert with column list indeed is not allowed with old planner enabled in Flink 1.10 while it will throws exception such as "Partial insert is not supported". Never mind for this issue. Focus on the UpsertMode exception, my es DDL is like the following: CR

Re: How to fully re-aggregate a keyed windowed aggregate in the same window ?

2020-02-14 Thread Robert Metzger
Hey Arnaud, sorry that you didn't get an answer yet. Were you able to solve your problem in the meantime? If not, I'll find somebody to answer your question :) On Thu, Jan 30, 2020 at 9:18 AM LINZ, Arnaud wrote: > Hello, > > > > I would like to compute statistics on a stream every hour. For that

Re: Flink 1.10 es sink exception

2020-02-14 Thread Jark Wu
Hi sunfulin, Is this the real query you submit? AFAIK, insert with column list is not allowed for now, i.e. the `INSERT INTO ES6_ZHANGLE_OUTPUT(aggId, pageId, ts, expoCnt, clkCnt)`. Could you attach the full SQL text, including DDLs of ES6_ZHANGLE_OUTPUT table and kafka_zl_etrack_event_stream t

Re: Re: Flink connect hive with hadoop HA

2020-02-14 Thread Robert Metzger
There's a configuration value "env.hadoop.conf.dir" to set the hadoop configuration directory: https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#env-hadoop-conf-dir If the files in that directory correctly configure Hadoop HA, the client side should pick up the config. On Tue,

Re: 1.9 timestamp type default

2020-02-14 Thread Timo Walther
Hi, the type system is still under heavy refactoring that touches a lot of interfaces. Where would you like to use java.sql.Timestamp? UDFs are not well supported right now. Source and sinks might work for the Blink planner and java.sql.Timestamp is the only supported conversion class of old

Re: AW: How Flink Kafka Consumer works when it restarts

2020-02-14 Thread Robert Metzger
The main benefit of letting Flink keep the offsets is that you get exactly once semantics (with the offsets in Flink state, it is aligned with all your other state). When storing the offsets in Kafka, you get at least once semantics (= you are seeing some messages twice on restore / when continuing

Re: group by optimizations with sorted input

2020-02-14 Thread Robert Metzger
I assume you are using the DataSet API. There, you can do a combinable group reduce: https://ci.apache.org/projects/flink/flink-docs-master/dev/batch/dataset_transformations.html#combinable-groupreducefunctions The combine() method will be executed on the sender side, reducing the amount of data t

[Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-14 Thread Niels Basjes
Hi, I have test code ( https://github.com/nielsbasjes/yauaa/blob/v5.15/udfs/flink/src/test/java/nl/basjes/parse/useragent/flink/TestUserAgentAnalysisMapperInline.java#L140 ) that writes a DataStream to a List<> using LocalCollectionOutputFormat to verify if the pipeline did what it should do. Lis

Re: Batch reading from Cassandra. How to?

2020-02-14 Thread Lasse Nedergaard
Any good suggestions? Lasse Den tir. 11. feb. 2020 kl. 08.48 skrev Lasse Nedergaard < lassenederga...@gmail.com>: > Hi. > > We would like to do some batch analytics on our data set stored in > Cassandra and are looking for an efficient way to load data from a single > table. Not by key, but rand

Re: [ANNOUNCE] Apache Flink 1.10.0 released

2020-02-14 Thread Till Rohrmann
Congratulations to everyone and a big thanks to our release managers! Cheers, Till On Thu, Feb 13, 2020 at 2:41 PM Oytun Tez wrote: > 😍 > > On Thu, Feb 13, 2020 at 2:26 AM godfrey he wrote: > >> Congrats to everyone involved! Thanks, Yu & Gary. >> >> Best, >> godfrey >> >> Yu Li 于2020年2月13日周四