Re: Task manager count goes the expand then converge process when running flink on YARN

2018-10-24 Thread vino yang
Hi Henry, The phenomenon you expressed is there, this is a bug, but I can't remember its JIRA number. Thanks, vino. 徐涛 于2018年10月24日周三 下午11:27写道: > Hi experts > I am running flink job on YARN in job cluster mode, the job is divided > into 2 tasks, the following are some configs of the job: >

Re: FlinkCEP, circular references and checkpointing failures

2018-10-24 Thread Shailesh Jain
Hi Dawid, I've upgraded to flink 1.6.1 and rebased by changes against the tag 1.6.1, the only commit on top of 1.6 is this: https://github.com/jainshailesh/flink/commit/797e3c4af5b28263fd98fb79daaba97cabf3392c I ran two separate identical jobs (with and without checkpointing enabled), I'm

RocksDB State Backend Exception

2018-10-24 Thread Ning Shi
Hi, We are doing some performance testing on a 12 node cluster with 8 task slots per TM. Every 15 minutes or so, the job would run into the following exception. java.lang.IllegalArgumentException: Illegal value provided for SubCode. at

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-24 Thread Zhang, Xuefu
Hi all, To wrap up the discussion, I have attached a PDF describing the proposal, which is also attached to FLINK-10556 [1]. Please feel free to watch that JIRA to track the progress. Please also let me know if you have additional comments or questions. Thanks, Xuefu [1]

Re: Dynamically Generated Classes - Cannot load user class

2018-10-24 Thread shkob1
OK I think i figured it out - not sure though exactly the reason: It seems that i need to have a stream type - Generic Type of the super class - rather than a Pojo of the concrete generated class. It seems like the operation definition otherwise cannot load the Pojo class on the task creation. So

Re: Question over Incremental Snapshot vs Full Snapshot in rocksDb state backend

2018-10-24 Thread chandan prakash
Thanks Tzu-Li for redirecting. Would also like to be corrected if my any inference from the code is incorrect or incomplete. I am sure it will help to clear doubts of more developers like me :) Thanks in advance. Regards, Chandan On Wed, Oct 24, 2018 at 9:19 PM Tzu-Li (Gordon) Tai wrote: >

[ANNOUNCE] Flink Forward San Francisco 2019 - Call for Presentations is now open

2018-10-24 Thread Till Rohrmann
Hi everybody, the Call for Presentations for Flink Forward San Francisco 2019 is now open! Apply by November 30 to share your compelling Flink use case, best practices, and latest developments with the community on April 1-2 in San Francisco, CA. Submit your proposal:

Re: Fail to recover Keyed State afeter ReinterpretAsKeyedStream

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi Jose, As far as I know, you should be able to use keyed state on a stream returned by DataStreamUtils.reinterpretAsKeyedStream function. That shouldn’t be the issue here. Have you looked into the logs for any meaningful exceptions of why the restore failed? That would be helpful here to

Re: cannot find symbol of "fromargs"

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi! How are you packaging your Flink program? This looks like a simple dependency error. If you don’t know where to start when beginning to write your Flink program, the quickstart Maven templates are always a good place to begin with [1]. Cheers, Gordon [1] 

Re: Question over Incremental Snapshot vs Full Snapshot in rocksDb state backend

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi, I’m forwarding this question to Stefan (cc’ed). He would most likely be able to answer your question, as he has done substantial work in the RocksDB state backends. Cheers, Gordon On 24 October 2018 at 8:47:24 PM, chandan prakash (chandanbaran...@gmail.com) wrote: Hi, I am new to Flink.

Re: Get request header from Kinesis

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi, Could you point to the AWS Kinesis Java API that exposes record headers? As far as I can tell from the Javadoc, I can’t seem to find how to retrieve headers from Kinesis records. If there is a way to do that, then it might make sense to expose that from the Kinesis connector’s

Guava conflict when using flink kinesis consumer with grpc protobuf

2018-10-24 Thread Vijay Balakrishnan
Hi, I have a dependency on guava in grpc protobuf as follows: com.google.guava guava 26.0-jre I also use Flink Kinesis Connector in the same project: org.apache.flink flink-connector-kinesis_${scala.binary.version} ${flink.version} This Flink Kinesis connector has a

Task manager count goes the expand then converge process when running flink on YARN

2018-10-24 Thread 徐涛
Hi experts I am running flink job on YARN in job cluster mode, the job is divided into 2 tasks, the following are some configs of the job: parallelism.default => 16 taskmanager.numberOfTaskSlots => 8 -yn => 2 when the program starts, I found that the count

Task manager count goes the expand then converge process when running flink on YARN

2018-10-24 Thread 徐涛
Hi experts I am running flink job on YARN in job cluster mode, the job is divided into 2 tasks, the following are some configs of the job: parallelism.default => 16 taskmanager.numberOfTaskSlots => 8 -yn => 2 when the program starts, I found that the count

Flink weird checkpointing behaviour

2018-10-24 Thread Pawel Bartoszek
Hi, We have just upgraded to Flink 1.5.2 on EMR from Flink 1.3.2. We have noticed that some checkpoints are taking a very long time to complete some of them event fails with exception Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/jobmanager_0#-665361795]]

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-24 Thread Aaron Levin
Hey, First, I appreciate everyone's help! Thank you! I wrote several wrappers to try and debug this, including one which is an exact copy of `InputFormatSourceFunction` which also failed. They all failed with the same error I detail above. I'll post two of them below. They all extended

Fail to recover Keyed State afeter ReinterpretAsKeyedStream

2018-10-24 Thread Jose Cisneros
Hi, To avoid reshuffling in my job, I started using DataStreamUtils. reinterpretAsKeyedStream to avoid having to do another keyBy for the same key. The BackEndState is RocksDB. When the job recovers after a failure, the ProcessFunction after the keyBy restores its Keyed State correctly,

Re: Clean shutdown of streaming job

2018-10-24 Thread Ning Shi
Hi Neils, Thanks for the response. > I think your problem is that the Cassandra sink doesn't support exactly > once guarantees when the Cassandra query isn't idempotent. If possible, the > cleanest solution would be implementing a new or extending the existing > Cassandra sink with the >

cannot find symbol of "fromargs"

2018-10-24 Thread Mar_zieh
Hello I am new in Flink. I want to write a program in stream processing. I added this line to my program: ParameterTool mmm = new ParameterTool.fromArgs(args); But I got this error: cannot find symbol of "fromargs" would you please let me know how to solve this error? Thank you in advance.

Re: Flink Task Allocation on Nodes

2018-10-24 Thread Kien Truong
Hi, You can have multiple Flink clusters on the same set of physical machines. In our experience, it's best to deploy a separate Flink cluster for each job and adjust the resource accordingly. Best regards, Kien On Oct 24, 2018 at 20:17, > wrote: Flink Cluster in standalone with HA

Re: Flink Task Allocation on Nodes

2018-10-24 Thread Sayat Satybaldiyev
Flink Cluster in standalone with HA configuration. It has 6 Task managers and each has 8 slots. Overall, 48 slots for the cluster. >>If you cluster only have one task manager with one slot in each node, then the job should be spread evenly. Agree, this will solve the issue. However, the cluster

HighAvailability :: FsNegativeRunningJobsRegistry

2018-10-24 Thread Mikhail Pryakhin
Hi guys, I'm trying to substitute Zookeeper-based HA registry with YARN-based HA registry. (The idea was taken from the issue https://issues.apache.org/jira/browse/FLINK-5254) In Flink 1.6.1, there exists an org.apache.flink.runtime.highavailability.FsNegativeRunningJobsRegistry which claims

Question over Incremental Snapshot vs Full Snapshot in rocksDb state backend

2018-10-24 Thread chandan prakash
Hi, I am new to Flink. Was looking into the code to understand how Flink does FullSnapshot and Incremental Snapshot using RocksDB What I understood: 1. *For full snapshot, we call RocksDb snapshot api* which basically an iterator handle to the entries in RocksDB instance. We iterate over every

Re: Flink Task Allocation on Nodes

2018-10-24 Thread Kien Truong
Hi, How are your task managers deploy ? If you cluster only have one task manager with one slot in each node, then the job should be spread evenly. Regards, Kien On 10/24/2018 4:35 PM, Sayat Satybaldiyev wrote: Is there any way to indicate flink not to allocate all parallel tasks on one

Re: Checkpoint acknowledge takes too long

2018-10-24 Thread Hequn Cheng
Hi Henry, @Kien is right. Take a thread dump to see what was doing in the TaskManager. Also check whether gc happens frequently. Best, Hequn On Wed, Oct 24, 2018 at 5:03 PM 徐涛 wrote: > Hi > I am running a flink application with parallelism 64, I left the > checkpoint timeout default

Re: Size of Checkpoints increasing with time

2018-10-24 Thread Kien Truong
Hi, Do you use incremental checkpoint ? RocksDB is an append-only DB, so you will experience the steady increase in state size until a compaction occurs and old values of keys are garbage-collected. However, the average state size should stabilize after a while, if the load doesn't change.

Re: Checkpoint acknowledge takes too long

2018-10-24 Thread Kien Truong
Hi, In my experience, this is most likely due to one sub-task is blocked doing some long-running operation. Try to run the task manager with some profiler (like VisualVM) and check for hot spot. Regards, Kien On 10/24/2018 4:02 PM, 徐涛 wrote: Hi I am running a flink application

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-24 Thread Kien Truong
Hi, Since InputFormatSourceFunction is a subclass of RichParallelSourceFunction, your wrapper should also extend this class. In addition, remember to overwrite the methods defined in the AbstractRichFunction interface and proxy the call to the underlying InputFormatSourceFunction, in order

Flink Task Allocation on Nodes

2018-10-24 Thread Sayat Satybaldiyev
Is there any way to indicate flink not to allocate all parallel tasks on one node? We have a stateless flink job that reading from 10 partition topic and have a parallelism of 6. Flink job manager allocates all 6 parallel operators to one machine, causing all traffic from Kafka allocated to only

Checkpoint acknowledge takes too long

2018-10-24 Thread 徐涛
Hi I am running a flink application with parallelism 64, I left the checkpoint timeout default value, which is 10minutes, the state size is less than 1MB, I am using the FsStateBackend. The application triggers some checkpoints but all of them fails due to "Checkpoint expired

Error migrating to 1.6

2018-10-24 Thread Juan Gentile
Hello! We are trying to migrate from 1.4 to 1.6 and we are getting the following exception in our jobs: org.apache.flink.util.FlinkException: The assigned slot container_e293_1539164595645_3455869_01_011241_2 was removed. at

Re: Error with custom `SourceFunction` wrapping `InputFormatSourceFunction` (Flink 1.6)

2018-10-24 Thread Dawid Wysakowicz
Hi Aaron, Could you share the code of you custom function? I am also adding Aljosha and Kostas to cc, who should be more helpful on that topic. Best, Dawid On 19/10/2018 20:06, Aaron Levin wrote: > Hi, > > I'm writing a custom `SourceFunction` which wraps an underlying >

Re: Manual trigger the window in fold operator or incremental aggregation

2018-10-24 Thread Dominik Wosiński
Hey Zhen Li, What are You trying to do exactly? Maybe there is a more suitable method than manually triggering windows available in Flink. Best Regards, Dom. śr., 24 paź 2018 o 09:25 Dawid Wysakowicz napisał(a): > Hi Zhen Li, > > As far as I know that is not possible. For such custom handling

Flink Error - Remote system has been silent for too long

2018-10-24 Thread Anil
The Flink jobs are deployed in Yarn cluster. I am seeing the following log for some of my jobs in Job Manager. I'm using Flink 1.4. The job has, taskmanager.exit-on-fatal-akka-error=true. But I don't see the task manager being restarted. I made the following observations - 1. One job does a

Re: Manual trigger the window in fold operator or incremental aggregation

2018-10-24 Thread Dawid Wysakowicz
Hi Zhen Li, As far as I know that is not possible. For such custom handling I would recommend having a look at ProcessFunction[1], where you have access to timers and state. Best, Dawid [1]

Re: Window State is not being store on check-pointing

2018-10-24 Thread Dawid Wysakowicz
Hi, Do you mean that you stop your job manually and then start it? Checkpoints are used in case of failures and are 1) automatically not persisted across separate job runs (unless you set them to be externalized) 2) are not automatically picked up for starting your job. For your case when you