Re: Data loss when restoring from savepoint

2019-01-10 Thread Juho Autio
Stefan, would you have time to comment? On Wednesday, January 2, 2019, Juho Autio wrote: > Bump – does anyone know if Stefan will be available to comment the latest > findings? Thanks. > > On Fri, Dec 21, 2018 at 2:33 PM Juho Autio wrote: > >> Stefan, I managed to analyze savepoint with bravo.

Re: [Flink 1.7.0] always got 10s ask timeout exception when submitting job with checkpoint via REST

2019-01-10 Thread Steven Wu
Gary, thanks a lot. web.timeout seems to help. now I ran into a diff issue with loading the checkpoint. will take that separately. On Thu, Jan 10, 2019 at 12:25 PM Gary Yao wrote: > Hi all, > > I think increasing the default value of the config option web.timeout [1] > is > what you are

Recovery problem 2 of 2 in Flink 1.6.3

2019-01-10 Thread John Stone
This is the second of two recovery problems I'm seeing running Flink in Kubernetes. I'm posting them in separate messages for brevity and because the second is not directly related to the first. Any advice is appreciated. First problem:

Recovery problem 1 of 2 in Flink 1.6.3

2019-01-10 Thread John Stone
This is the first of two recovery problems I'm seeing running Flink 1.6.3 in Kubernetes. I'm posting them in separate messages for brevity and because the second is not directly related to the first. Any advice is appreciated. Setup: Flink 1.6.3 in Kubernetes

Re: [Flink 1.7.0] always got 10s ask timeout exception when submitting job with checkpoint via REST

2019-01-10 Thread Gary Yao
Hi all, I think increasing the default value of the config option web.timeout [1] is what you are looking for. Best, Gary [1] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/RestHandlerConfiguration.java#L76 [2]

Re: [Flink 1.7.0] always got 10s ask timeout exception when submitting job with checkpoint via REST

2019-01-10 Thread Aaron Levin
We are also experiencing this! Thanks for speaking up! It's relieving to know we're not alone :) We tried adding `akka.ask.timeout: 1 min` to our `flink-conf.yaml`, which did not seem to have any effect. I tried adding every other related akka, rpc, etc. timeout and still continue to encounter

[Flink 1.7.0] always got 10s ask timeout exception when submitting job with checkpoint via REST

2019-01-10 Thread Steven Wu
We are trying out Flink 1.7.0. We always get this exception when submitting a job with external checkpoint via REST. Job parallelism is 1,600. state size is probably in the range of 1-5 TBs. Job is actually started. Just REST api returns this failure. If we submitting the job without external

Re: Reducing runtime of Flink planner

2019-01-10 Thread Niklas Teichmann
Hi Fabian and Timo, Thanks for your answers! At the moment we're working at updating our project to Flink 1.7, so that we can check if the commit you wrote about solves the problem. The debugging we did so far seems to point to calcite as being responsible for the long planning times -

Re: Custom Serializer for Avro GenericRecord

2019-01-10 Thread Tzu-Li (Gordon) Tai
Hi, Have you looked at [1]? You can annotate your type and provide a type info factory. The factory would be used to create the TypeInformation for that type, and in turn create the serializer used for that type. [1]

Re: SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Taher Koitawala
Ok, thanks for the clarification. Really appreciate your help Kostas On Thu 10 Jan, 2019, 6:19 PM Kostas Kloudas Hi Taher, > > Well, I would say there is no single class that implements it. > In a nutshell, it is the StreamingFileSink that (through Buckets) tells > the responsible Bucket what to

Re: SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Kostas Kloudas
Hi Taher, Well, I would say there is no single class that implements it. In a nutshell, it is the StreamingFileSink that (through Buckets) tells the responsible Bucket what to do at each step of the lifecycle of the Flink operator (mainly on element, on checkpoint, on checkpoint completed and on

Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-10 Thread Kostas Kloudas
Hi Gagan, I agree with Congxian! In MapState, when accessing the state/value associated with a key in the map, then the whole value is de-serialized (and serialized in case of a put()). Given this, it is more efficient to have many keys, with small state, than fewer keys with huge state. Cheers,

Re: Multiple MapState vs single nested MapState in stateful Operator

2019-01-10 Thread Congxian Qiu
Hi, Gagan Agrawal In my opinion, I prefer the first. Here is the reason. In RocksDB StateBackend, we will serialize the key, namespace, user-key into a serialized bytes (key-bytes) and serialize user-value to serialized bytes(value-bytes) then insert into the key-bytes/value-bytes into

Re: Reducing runtime of Flink planner

2019-01-10 Thread Fabian Hueske
Hi Niklas, The planning time of a job does not depend on the data size. It would be the same whether you process 5MB or 5PB. FLINK-10566 (as pointed to by Timo) fixed a problem for plans with many braching and joining nodes. Looking at your plan, there are some, but (IMO) not enough to be

Re: windowAll and AggregateFunction

2019-01-10 Thread CPC
I converted to this SingleOutputStreamOperator> tuple2Stream = sourceStream.map(new RichMapFunction>() { @Override public Tuple2 map(XMPP value) throws Exception { return Tuple2.of(getRuntimeContext().getIndexOfThisSubtask(), value); } });

Re: [DISCUSS] Dropping flink-storm?

2019-01-10 Thread Fabian Hueske
+1 from my side as well. I would assume that most Bolts that are supported by our current wrappers can be easily converted into respective Flink functions. Fabian Am Do., 10. Jan. 2019 um 10:35 Uhr schrieb Kostas Kloudas < k.klou...@da-platform.com>: > +1 to drop as well. > > On Thu, Jan 10,

Re: SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Taher Koitawala
Hi Kostas, Thanks you for the clarification, also can you please point how StreamingFileSink uses TwoPhaseCommit. Can you also point out the implementing class for that? Regards, Taher Koitawala GS Lab Pune +91 8407979163 On Thu, Jan 10, 2019 at 3:10 PM Kostas Kloudas wrote: >

Re: [DISCUSS] Dropping flink-storm?

2019-01-10 Thread Kostas Kloudas
+1 to drop as well. On Thu, Jan 10, 2019 at 10:15 AM Ufuk Celebi wrote: > +1 to drop. > > I totally agree with your reasoning. I like that we tried to keep it, > but I don't think the maintenance overhead would be justified. > > – Ufuk > > On Wed, Jan 9, 2019 at 4:09 PM Till Rohrmann wrote: >

Re: SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Taher Koitawala
StreamingFileSink extends RichSinkFunction and implements CheckpointedFunction, CheckpointListener and ProcessingTimeCallback however TwoPhaseCommitSinkFunction is never used by StreamingFileSink. Hence I had a question if the sink uses the TwoPhaseCommit protocol or not. Regards, Taher

Re: [DISCUSS] Dropping flink-storm?

2019-01-10 Thread Ufuk Celebi
+1 to drop. I totally agree with your reasoning. I like that we tried to keep it, but I don't think the maintenance overhead would be justified. – Ufuk On Wed, Jan 9, 2019 at 4:09 PM Till Rohrmann wrote: > > With https://issues.apache.org/jira/browse/FLINK-10571, we will remove the > Storm

Re: SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Kostas Kloudas
Hi Taher, The StreamingFileSink implements a version of TwoPhaseCommit. Can you elaborate a bit on what do you mean by " TwoPhaseCommit is not being used"? Cheers, Kostas On Thu, Jan 10, 2019 at 9:29 AM Taher Koitawala wrote: > Hi All, > As per my understanding and the API of

SteamingFileSink with TwoPhaseCommit

2019-01-10 Thread Taher Koitawala
Hi All, As per my understanding and the API of StreamingFileSink, TwoPhaseCommit is not being used. Can someone please confirm is that's right? Also if StreamingFileSink does not support TwoPhaseCommits what is the best way to implement this? Regards, Taher Koitawala GS