Re: Endorsed lib in Flink

2019-02-05 Thread Till Rohrmann
Hi Chirag, have you tried excluding the log4j1 jars and adding the log4j 1.2 bridge [1] so that all logging goes through log4j2? [1] https://logging.apache.org/log4j/2.0/log4j-1.2-api/ Cheers, Till On Tue, Feb 5, 2019 at 1:04 PM Chirag Dewan wrote: > Hi, > > Is there some sort of endorsed lib

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Till Rohrmann
Ufuk's proposal (having a lean default release and a user convenience tarball) sounds good to me. That way advanced users won't be bothered by an unnecessarily large release and new users can benefit from having many useful extensions bundled in one tarball. Cheers, Till On Wed, Jan 23, 2019 at 3

Re: Setting flink-conf params in IDE

2019-01-16 Thread Till Rohrmann
Hi Alexandru, you can call `StreamExecutionEnvironment#createLocalEnvironment` which you can pass a Flink configuration object. Cheers, Till On Wed, Jan 16, 2019 at 1:05 PM Alexandru Gutan wrote: > Hi everyone! > > Is there a way to set flink-conf.yaml params but when running from the > IDE? >

Re: StreamingFileSink cannot get AWS S3 credentials

2019-01-16 Thread Till Rohrmann
MR did not work. > > Regards, > Vinay Patil > > > On Wed, Jan 16, 2019 at 3:05 PM Till Rohrmann > wrote: > >> The old BucketingSink was using Hadoop's S3 filesystem directly whereas >> the new StreamingFileSink uses Flink's own FileSystem which need to be

Re: StreamingFileSink cannot get AWS S3 credentials

2019-01-16 Thread Till Rohrmann
> adding fs.s3a.impl to core-site.xml when the default configurations were > not working. > > Regards, > Vinay Patil > > > On Wed, Jan 16, 2019 at 2:55 PM Till Rohrmann > wrote: > >> Hi Vinay, >> >> Flink's file systems are self contained and wo

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-16 Thread Till Rohrmann
Hi Sohimankotia, you can control Flink's failure behaviour in case of a checkpoint failure via the `ExecutionConfig#setFailTaskOnCheckpointError(boolean)`. Per default it is set to true which means that a Flink task will fail if a checkpoint error occurs. If you set it to false, then the job won't

Re: StreamingFileSink cannot get AWS S3 credentials

2019-01-16 Thread Till Rohrmann
Hi Vinay, Flink's file systems are self contained and won't respect the core-site.xml if I'm not mistaken. Instead you have to set the credentials in the flink configuration flink-conf.yaml via `fs.s3a.access.key: access_key`, `fs.s3a.secret.key: secret_key` and so on [1]. Have you tried this out?

Re: Parallelism questions

2019-01-15 Thread Till Rohrmann
I'm not aware of someone working on this feature right now. On Tue, Jan 15, 2019 at 3:22 PM Alexandru Gutan wrote: > Thats great news! > > Are there any plans to expose it in the upcoming Flink release? > > On Tue, 15 Jan 2019 at 12:59, Till Rohrmann wrote: > >&g

Re: Subtask much slower than the others when creating checkpoints

2019-01-15 Thread Till Rohrmann
IN_ON_CANCELLATION. I assume that > using the MemoryStateBackend and CANCELLATION should remove any possibility > of disk/IO congestions, am I wrong?. > > > > Pasquale > > > > *From:* Till Rohrmann > *Sent:* 15 January 2019 10:33 > *To:* Pasquale Vazzana >

Re: Parallelism questions

2019-01-15 Thread Till Rohrmann
; .setUID(..) > .sink(..) > .setParallelism(3) > .setUID(..) > > I think it would be a good idea to have */jobs/:jobid/rescaling,* additionally > requiring the *operatorUID* as a queryParameter*, *so that the > parallelism of specific operators could be changed. > > Best,

Re: Subtask much slower than the others when creating checkpoints

2019-01-15 Thread Till Rohrmann
I've tried to use murmur2 for hashing but it didn't help either. > The subtask that seems causing problems seems to be a CoProcessFunction. > I am going to debug Flink but since I'm relatively new to it, it might > take a while so any help will be appreciated. > > Pasqual

Re: Parallelism questions

2019-01-15 Thread Till Rohrmann
Hi Alexandru, you can use the `modify` command `bin/flink modify --parallelism ` to modify the parallelism of a job. At the moment, it is implemented as first taking a savepoint, stopping the job and then redeploying the job with the changed parallelism and resuming from the savepoint. Cheers, T

Re: Recovery problem 2 of 2 in Flink 1.6.3

2019-01-15 Thread Till Rohrmann
Hi John, this looks indeed strange. How many concurrent operators do you have which write state to s3? After the cancellation, the JobManager should keep the slots for some time until they are freed. This is the normal behaviour and can be controlled with `slot.idle.timeout`. Could you maybe shar

Re: Recovery problem 1 of 2 in Flink 1.6.3

2019-01-15 Thread Till Rohrmann
Hi John, this is definitely not how Flink should behave in this situation and could indicate a bug. From the logs I couldn't figure out the problem. Would it be possible to obtain for the TMs and JM the full logs with DEBUG log level? This would help me to further debug the problem. Cheers, Till

Re: [DISCUSS] Dropping flink-storm?

2019-01-09 Thread Till Rohrmann
ache.org>: > >> That sounds very good to me. >> >> On 08.10.2018 11:36, Till Rohrmann wrote: >> > Good point. The initial idea of this thread was to remove the storm >> > compatibility layer completely. >> > >> > During the discussion I r

Re: ConnectTimeoutException when createPartitionRequestClient

2019-01-09 Thread Till Rohrmann
> On Tue, Jan 8, 2019 at 7:06 AM Till Rohrmann wrote: > >> Hi Wenrui, >> >> the exception now occurs while finishing the connection creation. I'm not >> sure whether this is so different. Could it be that your network is >> overloaded or not very reliable?

Re: Subtask much slower than the others when creating checkpoints

2019-01-08 Thread Till Rohrmann
Hi Bruno, there are multiple reasons why one of the subtasks can take longer for checkpointing. It looks as if there is not much data skew since the state sizes are relatively equal. It also looks as if the individual tasks all start at the same time with the checkpointing which indicates that the

Re: ConnectTimeoutException when createPartitionRequestClient

2019-01-08 Thread Till Rohrmann
ive Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > at > org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224) > at > org.apache.flink.shaded.netty

Re: Unable to restore the checkpoint on restarting the application!!

2019-01-07 Thread Till Rohrmann
; Hi Till > > Its Working for me know ,but *context.isRestored() **is always returning > false.* > > On Fri, Jan 4, 2019 at 7:42 PM Till Rohrmann wrote: > >> When starting a job from within the IDE using the LocalEnvironment, it is >> not possible to specify a checkpoin

Re: ConnectTimeoutException when createPartitionRequestClient

2019-01-07 Thread Till Rohrmann
nitely try the latest flink version. But at Uber, there is a > team who is responsible to provide a platform with Flink. They will upgrade > it at the end of this Month. Meanwhile, I would like to ask some help to > investigate how to increase the connection timeout and make it respected. &g

Re: Unable to restore the checkpoint on restarting the application!!

2019-01-04 Thread Till Rohrmann
: > The List it returns is blank > > On Fri, Jan 4, 2019 at 7:15 PM Till Rohrmann wrote: > >> Hi Puneet, >> >> what exactly is the problem when you try to resume from a checkpoint? >> >> Cheers, >> Till >> >> On Fri, Jan 4, 2019 at 2:31

Re: Unable to restore the checkpoint on restarting the application!!

2019-01-04 Thread Till Rohrmann
Hi Puneet, what exactly is the problem when you try to resume from a checkpoint? Cheers, Till On Fri, Jan 4, 2019 at 2:31 PM Puneet Kinra < puneet.ki...@customercentria.com> wrote: > Hi All > > I am creating a poc where i am trying the out of box feature of flink > for managed state of operator

Re: ConnectTimeoutException when createPartitionRequestClient

2019-01-04 Thread Till Rohrmann
Hi Wenrui, from the logs I cannot spot anything suspicious. Which configuration parameters have you changed exactly? Does the JobManager log contain anything suspicious? The current Flink version changed quite a bit wrt 1.4. Thus, it might be worth a try to run the job with the latest Flink versi

Re: same parallelism with different taskmanager and slots, skew occurs

2019-01-04 Thread Till Rohrmann
Hi, could you tell me how exactly you started the cluster and with which parameters (configured memory, maybe vcores, etc.)? Cheers, Till On Thu, Jan 3, 2019 at 2:37 AM varuy322 wrote: > Hi, Till > It's very kind of your reply. I got your point, I'm sorry to not make it > clear about my issue.

Re: Question about Flink optimizer on Stream API

2019-01-03 Thread Till Rohrmann
Hi Felipe, for streaming Flink currently does not optimize the data flow graph. I think the best reference is actually going through the code as you've done for the batch case. Cheers, Till On Wed, Dec 19, 2018 at 3:14 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hi, > > I was r

Re: How to shut down Flink Web Dashboard in detached Yarn session?

2019-01-02 Thread Till Rohrmann
a standalone cluster > > On 2019/01/02 14:13:52, Till Rohrmann wrote: > > Hi Sai, > > > > could you check that the dashboard you are seeing is really running on > Yarn > > and not a standalone Flink cluster which you have running locally? > > > >

Re: using updating shared data

2019-01-02 Thread Till Rohrmann
e than be kept in the > relevant operator state correct ? and updates (CRUD) on that list will be > preformed via the control stream. correct ? > BR > Avi > > On Wed, Jan 2, 2019 at 4:28 PM Till Rohrmann wrote: > >> Hi Avi, >> >> you could use Flink's broa

Re: Are Jobs allowed to be pending when slots are not enough

2019-01-02 Thread Till Rohrmann
Hi Xinyu, at the moment there is no such functionality in Flink. Whenever you submit a job, Flink will try to execute the job right away. If the job cannot get enough slots, then it will wait until the slot.request.timeout occurs and either fail or retry if you have a RestartStrategy configured.

Re: same parallelism with different taskmanager and slots, skew occurs

2019-01-02 Thread Till Rohrmann
Hi Rui, such a situation can occur if you have data skew in your data set (differently sized partitions if you key by some key). Assume you have 2 TMs with 2 slots each and you key your data by some key x. The partition assignment could look like: TM1: slot_1 = Partition_1, slot_2 = Partition_2 T

Re: Problem when use kafka sink with EXACTLY_ONCE in IDEA

2019-01-02 Thread Till Rohrmann
Hi Kaibo, which Kafka version are you running locally? When enabling exactly once processing guarantees, you need at least Kafka >= 0.11. The UnsupportedVersionException indicates that this constraint is not fulfilled [1]. [1] https://kafka.apache.org/0110/javadoc/index.html?org/apache/kafka/clie

Re: using updating shared data

2019-01-02 Thread Till Rohrmann
Hi Avi, you could use Flink's broadcast state pattern [1]. You would need to use the DataStream API but it allows you to have two streams (input and control stream) where the control stream is broadcasted to all sub tasks. So by ingesting messages into the control stream you can send model updates

Re: RuntimeException with valve output watermark when using CoGroup

2019-01-02 Thread Till Rohrmann
Thanks for the update Taneli. Glad that you solved the problem. If you should find out more about the more obscure case, let us know. Maybe there is something we can still improve to prevent misleading exceptions in the future. Cheers, Till On Tue, Jan 1, 2019 at 3:01 PM Taneli Saastamoinen < tan

Re: How to shut down Flink Web Dashboard in detached Yarn session?

2019-01-02 Thread Till Rohrmann
Hi Sai, could you check that the dashboard you are seeing is really running on Yarn and not a standalone Flink cluster which you have running locally? Cheers, Till On Mon, Dec 31, 2018 at 7:40 PM Sai Inampudi wrote: > Hey Gary, thanks for reaching out. > > Executing "yarn application -list" do

Re: no log exists in JM and TM when updated to Flink 1.7

2019-01-02 Thread Till Rohrmann
Hi Joshua, could you check the content of the logback.xml. Maybe this file has changed between the versions. Cheers, Till On Wed, Dec 26, 2018 at 11:19 AM Joshua Fan wrote: > Hi, > > It is very weird that there is no log file for JM and TM when run flink > job on yarn after updated flink to 1.

Re: 1.6 UI issues

2019-01-02 Thread Till Rohrmann
in SingleJobController and don’t know why when older request is > finished no UI has been changed. We will have a look closer on this > behavior. > > > > Does it ring a bell for you probably? > > > > Thank you > > > > Kind Regards > > Oleksandr > &g

Re: [ANNOUNCE] Apache Flink 1.5.6 released

2018-12-28 Thread Till Rohrmann
Thanks a lot for being our release manager Thomas. Great work! Also thanks to the community for making this release possible. Cheers, Till On Thu, Dec 27, 2018 at 2:58 AM Jeff Zhang wrote: > Thanks Thomas. It's nice to have a more stable flink 1.5.x > > vino yang 于2018年12月27日周四 上午9:43写道: > >>

Re: [ANNOUNCE] Apache Flink 1.6.3 released

2018-12-24 Thread Till Rohrmann
Thanks a lot for being our release manager Gordon. Great job! And also a big thanks to the community for making this release possible. Cheers, Till On Mon, Dec 24, 2018 at 2:11 AM vino yang wrote: > Thanks for being the release manager Gordon. And glad to see Flink 1.6.3 > released. > > Best, >

[ANNOUNCE] Weekly community update #51

2018-12-20 Thread Till Rohrmann
Dear community, this is the weekly community update thread #51. Please post any news and updates you want to share with the community to this thread. # Flink Forward China is happening This week the Flink community meets in Beijing for the first Flink Forward China which takes place from the 20t

Re: [SURVEY] Usage of flink-python and flink-streaming-python

2018-12-19 Thread Till Rohrmann
, most of the users at Alibaba are using SQL. > > Some > > > of > > > > > users at Alibaba want integrated python libraries with Flink for > > > > streaming > > > > > processing, and Jython is unusable. > > > > > > > >

Re: Unbalanced Kafka consumer consumption

2018-12-19 Thread Till Rohrmann
Great to hear and thanks for letting us know. Cheers, Till On Wed, Dec 19, 2018 at 5:39 PM Gerard Garcia wrote: > We finally figure it out. We had a large value in the Kafka consumer > option 'max.partition.fetch.bytes', this made the KafkaConsumer to not > consume at a balanced rate from all p

Re: flink operator's latency metrics continues to increase.

2018-12-19 Thread Till Rohrmann
ec 17, 2018 at 2:37 PM Suxing Lee <913910...@qq.com> wrote: > Hi Till Rohrmann, > > I was running flink 1.5.5, and I use prometheus to collect metrics to > check latency of my jobs. > But sometimes I observerd that the operator's latency metrics continues > to increase i

Re: Flink on Kubernetes (Minikube)

2018-12-19 Thread Till Rohrmann
Hi Alexandru, minikube ssh 'sudo ip link set docker0 promisc on' is not supposed to solve the problem you are seeing. It only resolves the problem if the JobMaster wants to reach itself through the jobmanager-service name. Your problem seems to be something else. Could you check if jobmanager-serv

Re: After job cancel, leftover ZK state prevents job manager startup

2018-12-11 Thread Till Rohrmann
Hi Micah, the problem looks indeed similar to FLINK-10255. Could you tell me your exact cluster setup (HA with stand by JobManagers?). Moreover, the logs of all JobManagers on DEBUG level would be helpful for further debugging. Cheers, Till On Tue, Dec 11, 2018 at 10:09 AM Stefan Richter wrote:

Re: Question regarding rescale api

2018-12-10 Thread Till Rohrmann
Hi Mingliang, Aljoscha is right. At the moment Flink does not support to spread out tasks across all TaskManagers. This is a feature which we still need to add. Until then, you need to set the parallelism to the number of available slots in order to guarantee that all TaskManagers are equally used

[ANNOUNCE] Weekly community update #50

2018-12-10 Thread Till Rohrmann
Dear community, this is the weekly community update thread #50. Please post any news and updates you want to share with the community to this thread. # Unified core API for streaming and batch The community started to discuss how to bring streaming and batch closer together by implementing a com

[SURVEY] Usage of flink-python and flink-streaming-python

2018-12-07 Thread Till Rohrmann
Dear Flink community, in order to better understand the needs of our users and to plan for the future, I wanted to reach out to you and ask how much you use Flink's Python API, namely flink-python and flink-streaming-python. In order to gather feedback, I would like to ask all Python users to res

[ANNOUNCE] Weekly community update #49

2018-12-07 Thread Till Rohrmann
Dear community, this is the weekly community update thread #49. Please post any news and updates you want to share with the community to this thread. # Flink 1.7.0 has been released The community has release Flink 1.7.0 [1]. # Flink intro slide set Fabian has refined the slide set for an intro

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-06 Thread Till Rohrmann
ied in this case? > > Hao Sun > Team Lead > 1019 Market St. 7F > San Francisco, CA 94103 > > > On Wed, Dec 5, 2018 at 10:09 AM Till Rohrmann > wrote: > >> Hi Hao, >> >> I think you need to provide a savepoint file via --fromSavepoint to >> resume

Re: How to distribute subtasks evenly across taskmanagers?

2018-12-05 Thread Till Rohrmann
Hi Sunny, this is a current limitation of Flink's scheduling. We are currently working on extending Flinks scheduling mechanism [1] which should also help with solving this problem. At the moment, I recommend using the per-job mode so that you have a single cluster per job. [1] https://issues.apa

Re: Flink 1.7 job cluster (restore from checkpoint error)

2018-12-05 Thread Till Rohrmann
Hi Hao, I think you need to provide a savepoint file via --fromSavepoint to resume from in order to specify --allowNonRestoredState. Otherwise this option will be ignored because it only works if you resume from a savepoint. Cheers, Till On Wed, Dec 5, 2018 at 12:29 AM Hao Sun wrote: > I am us

Re: long lived standalone job session cluster in kubernetes

2018-12-05 Thread Till Rohrmann
:39 PM Derek VerLee wrote: > Sounds good. > > Is someone working on this automation today? > > If not, although my time is tight, I may be able to work on a PR for > getting us started down the path Kubernetes native cluster mode. > > > On 12/4/18 5:35 AM, Till R

Re: Assigning a port range to rest.port

2018-12-05 Thread Till Rohrmann
Hi Gyula and Jeff, I think at the moment it is not possible to define a port range for the REST client. Maybe we should add something similar to the RestOptions#BIND_ADDRESS, namely introducing a RestOptions#BIND_PORT which can define a port range for the binding port. RestOptions#PORT will only b

Re: long lived standalone job session cluster in kubernetes

2018-12-04 Thread Till Rohrmann
Hi Derek, what I would recommend to use is to trigger the cancel with savepoint command [1]. This will create a savepoint and terminate the job execution. Next you simply need to respawn the job cluster which you provide with the savepoint to resume from. [1] https://ci.apache.org/projects/flink/

Re: Weird behavior in actorSystem shutdown in akka

2018-12-03 Thread Till Rohrmann
Hi Joshua, sorry for getting back to you so late. Personally, I haven't seen this problem before. Without more log context I think I won't be able to help you. This looks a bit more like an Akka problem than a Flink problem to be honest. One cause could be that akka.remote.flush-wait-on-shutdown

Re: Custom scheduler in Flink

2018-11-30 Thread Till Rohrmann
Hi Felipe, https://issues.apache.org/jira/browse/FLINK-10429 might also be interesting. The community is currently working on making the Scheduler pluggable to make it easier to extend this component. Cheers, Till On Wed, Nov 28, 2018 at 2:56 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> w

Re: Apache Flink 1.7.0 jar complete ?

2018-11-30 Thread Till Rohrmann
Hi Arnaud, I tried to setup the same testing project as you've described and it worked for me. Could you maybe try to clear your Maven repository? Maybe not all dependencies had been properly mirrored to Maven central. Cheers, Till On Fri, Nov 30, 2018 at 2:31 PM Till Rohrmann wrote: >

Re: Apache Flink 1.7.0 jar complete ?

2018-11-30 Thread Till Rohrmann
Source text = env.addSource(*new* > SourceFunction() { > > @Override > > *public* *void* run(*final* SourceContext ctx) > *throws* Exception { > > *for* (*int* count = 0; count < 5; count++) { > > ctx.coll

[ANNOUNCE] Apache Flink 1.7.0 released

2018-11-30 Thread Till Rohrmann
The Apache Flink community is very happy to announce the release of Apache Flink 1.7.0, which is the next major release. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. The release is availab

Re: Logging Kafka during exceptions

2018-11-22 Thread Till Rohrmann
31 371 603 > Mobile : +852 9611 3969 > > 9/F, 33 Lockhart Road, Wan Chai, Hong Kong > www.celer-tech.com > > > > > > > > On 23 Nov 2018, at 00:12, Till Rohrmann wrote: > > Hi Scott, > > I think you could write some Wrappers for the different user fun

Re: Flink restart strategy on specific exception

2018-11-22 Thread Till Rohrmann
Hi Kasif, I think in this situation it is best if you defined your own custom RestartStrategy by specifying a class which has a `RestartStrategyFactory createFactory(Configuration configuration)` method as `restart-strategy: MyRestartStrategyFactoryFactory` in `flink-conf.yaml`. Cheers, Till On

Re: Logging Kafka during exceptions

2018-11-22 Thread Till Rohrmann
Hi Scott, I think you could write some Wrappers for the different user function types which could contain the logging logic. That way you would still need to wrap you actual business logic but don't have to duplicate the logic over and over again. If you also want to log the state, then you would

Re: elasticsearch sink can't connect to elastic cluster with BasicAuth

2018-11-22 Thread Till Rohrmann
Hi, I think you need to a custom `RestClientFactory` which enables basic auth on the ElasticSearch RestClient according to this documentation [1]. You can set the RestClientFactory on the ElasticsearchSink.Builder. [1] https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_basic_

Re: OutOfMemoryError while doing join operation in flink

2018-11-22 Thread Till Rohrmann
Hi Akshay, Flink currently does not support to automatically distribute hot keys across different JVMs. What you can do is to adapt the parallelism/number of partitions manually if you encounter that one partition contains a lot of hot keys. This might mitigate the problem by partitioning the hot

Re: DataSet - Broadcast set in output format

2018-11-22 Thread Till Rohrmann
Hi Bastien, the OutputFormat specifies how a given record is written to an external system. The DataSink using these formats do not support using broadcast variables. This is currently a limitation of Flink. What you could do is to introduce a mapper before your sink which enriches the records wi

Re: Passing application configuring to Flink uber jar

2018-11-22 Thread Till Rohrmann
Hi Krishna, I think the problem is that you are trying to pass in dynamic properties (-Dconfig.file=dev.conf) to an already started the cluster. The Flink cluster components or their JVMs need to know the env.java.opts at cluster start up time and not when the Flink job is submitted. You can check

[ANNOUNCE] Weekly community update #47

2018-11-20 Thread Till Rohrmann
Dear community, this is the weekly community update thread #47. Please post any news and updates you want to share with the community to this thread. # Updates on sharing state between subtasks Jamie opened a first PR to add a first version of sharing state between tasks. It works by using the J

Re: Task Manager allocation issue when upgrading 1.6.0 to 1.6.2

2018-11-13 Thread Till Rohrmann
ill be running in > shared environments like YARN. > > But anyway, thanks for that! I'm now up with 1.6.2. > > Cliff > > On Mon, Nov 12, 2018 at 6:04 AM Till Rohrmann > wrote: > >> Hi Cliff, >> >> the TaskManger fail to start with exit code 31 which

Re: Queryable state when key is UUID - getting Kyro Exception

2018-11-13 Thread Till Rohrmann
ava.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) >>> at >>> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) >>> at >>> org.apache.flink.queryablestate.network.Client$EstablishedConnection.onRequestFailure(Client.

Re: Implementation error: Unhandled exception - "Implementation error: Unhandled exception."

2018-11-13 Thread Till Rohrmann
Hi Richard, could you share with us the complete logs to better debug the problem. What do you mean exactly with upgrading your job? Cancel with savepoint and then resuming the new job from the savepoint? Thanks a lot. Cheers, Till On Mon, Nov 12, 2018 at 5:08 PM Timo Walther wrote: > Hi Richa

Re: Task Manager allocation issue when upgrading 1.6.0 to 1.6.2

2018-11-12 Thread Till Rohrmann
wrote: > Hi Till, > > Here are Job Manager logs, same job in both 1.6.0 and 1.6.2 at DEBUG > level. I saw several errors in 1.6.2, hope it's informative! > > Cliff > > On Fri, Nov 9, 2018 at 8:34 AM Till Rohrmann wrote: > >> Hi Cliff, >> >> this

Re: flink-1.6.1 :: job deployment :: detached mode

2018-11-09 Thread Till Rohrmann
ork.imps.ExistsBuilderImpl$1.processResult(ExistsBuilderImpl.java:137) > at > org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:554) > at > org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505) > > > Kind Regards, > Mike Prya

Re: Task Manager allocation issue when upgrading 1.6.0 to 1.6.2

2018-11-09 Thread Till Rohrmann
Hi Cliff, this sounds not right. Could you share the logs of the Yarn cluster entrypoint with the community for further debugging? Ideally on DEBUG level. The Yarn logs would also be helpful to fully understand the problem. Thanks a lot! Cheers, Till On Thu, Nov 8, 2018 at 9:59 PM Cliff Resnick

Re: Queryable state when key is UUID - getting Kyro Exception

2018-11-09 Thread Till Rohrmann
Could you send us a small example program which we can use to reproduce the problem? Cheers, Till On Fri, Nov 9, 2018 at 6:57 AM Jayant Ameta wrote: > Yeah, it IS using Kryo serializer. > > Jayant Ameta > > > On Wed, Nov 7, 2018 at 9:57 PM Till Rohrmann wrote: > >>

Re: HA jobmanagers redirect to ip address of leader instead of hostname

2018-11-09 Thread Till Rohrmann
t this: > https://issues.apache.org/jira/projects/FLINK/issues/FLINK-10748 > Best regards, > Jeroen Steggink > > On 07-Nov-18 16:06, Till Rohrmann wrote: > > Hi Jeroen, > > this sounds like a bug in Flink that we return sometimes IP addresses > instead of hostnames. Cou

Re: flink-1.6.1 :: job deployment :: detached mode

2018-11-08 Thread Till Rohrmann
the log file attached. > > > Kind Regards, > Mike Pryakhin > > On 7 Nov 2018, at 18:46, Till Rohrmann wrote: > > Hi Mike, > > have you tried whether the problem also occurs with Flink 1.6.2? If yes, > then please share with us the Flink logs with DEBUG log level to fu

Re: FlinkCEP, circular references and checkpointing failures

2018-11-07 Thread Till Rohrmann
passed into the constructor of the new instance. > I will open an issue and fix the problem. > > Best, > Stefan > > On 7. Nov 2018, at 17:17, Till Rohrmann wrote: > > Hi Shailesh, > > could you maybe provide us with an example program which is able to > reproduce this

Re: Error after upgrading to Flink 1.6.2

2018-11-07 Thread Till Rohrmann
Hi Flavio, I haven't seen this problem before. Are you using Flink's HBase connector? According to similar problems with Spark one needs to make sure that the hbase jars are on the classpath [1, 2]. If not, then it might be a problem with the MR1 version 2.6.0-mr1-cdh5.11.2 which caused problems f

Re: flink-1.6.2 in standalone-job mode | Cluster initialization failed.

2018-11-07 Thread Till Rohrmann
Hi Zavalit, the AbstractMethodError indicates that there must be some kind of version conflict. From Flink 1.6.1 to 1.6.2 we modified the signature of `ClusterEntrypoint#createResourceManager` which causes the problem if you mix up versions. Could you check that you don't mix Flink 1.6.1 and 1.6.2

Re: Unbalanced Kafka consumer consumption

2018-11-07 Thread Till Rohrmann
Hi Gerard, the behaviour you are describing sounds odd to me. I have a couple of questions: 1. Which Flink and Kafka version are you using? 2. How many partitions do you have? --> Try to set the parallelism of your job to the number of partitions. That way, you will have one partition per source

Re: Queryable state when key is UUID - getting Kyro Exception

2018-11-07 Thread Till Rohrmann
Hi Jayant, could you check that the UUID key on the TM is actually serialized using a Kryo serializer? You can do this by setting a breakpoint in the constructor of the `AbstractKeyedStateBackend`. Cheers, Till On Tue, Oct 30, 2018 at 9:44 AM bupt_ljy wrote: > Hi, Jayant > > Your code looks

Re: FlinkCEP, circular references and checkpointing failures

2018-11-07 Thread Till Rohrmann
Hi Shailesh, could you maybe provide us with an example program which is able to reproduce this problem? This would help the community to better debug the problem. It looks not right and might point towards a bug in Flink. Thanks a lot! Cheers, Till On Tue, Oct 30, 2018 at 9:10 AM Dawid Wysakowi

Re: InterruptedException when async function is cancelled

2018-11-07 Thread Till Rohrmann
Hi Anil, as Stephan stated, the fix is not included in Flink 1.4.2 but in the later version of Flink. Can you upgrade to Flink 1.5.5 or Flink 1.6.2 to check whether the problem still occurs? Cheers, Till On Sun, Oct 28, 2018 at 8:55 AM Anil wrote: > I do see the same error but in case differen

Re: RocksDB checkpointing dir per TM

2018-11-07 Thread Till Rohrmann
This is a very good point Elias. We actually forgot to add these options to the configuration documentation after a refactoring. I will fix it. Cheers, Till On Fri, Oct 26, 2018 at 8:27 PM Elias Levy wrote: > There is also state.backend.rocksdb.localdir. Oddly, I can find the > documentation f

Re: flink-1.6.1 :: job deployment :: detached mode

2018-11-07 Thread Till Rohrmann
Hi Mike, have you tried whether the problem also occurs with Flink 1.6.2? If yes, then please share with us the Flink logs with DEBUG log level to further debug the problem. Cheers, Till On Fri, Oct 26, 2018 at 5:46 PM Mikhail Pryakhin wrote: > Hi community! > > Righ after I've upgraded flink

Re: 答复: Flink1.6.0 submit job and got "No content to map due to end-of-input" Error

2018-11-07 Thread Till Rohrmann
ror. According to the JIRA issue it should be > fixed in 1.5.4 as well. > > Since I'm using Apache Beam to build the jar, I can't move to version > 1.6.x. > > What could it be? > > Cheers, > > Jeroen > On 07-Sep-18 17:52, Till Rohrmann wrote: >

Re: Flink Task Allocation on Nodes

2018-11-07 Thread Till Rohrmann
Hi Sayat, at the moment it is not possible to control the scheduling behaviour of Flink. In the future, we plan to add some kind of hints which controls whether tasks of a job get spread out or will be packed on as few nodes as possible. Cheers, Till On Fri, Oct 26, 2018 at 2:06 PM Kien Truong

Re: HA jobmanagers redirect to ip address of leader instead of hostname

2018-11-07 Thread Till Rohrmann
Hi Jeroen, this sounds like a bug in Flink that we return sometimes IP addresses instead of hostnames. Could you tell me which Flink version you are using? In the current version, the redirect address and the address retrieved from ZooKeeper should actually be the same. In the future, we plan to

Re: RichInputFormat working differently in eclipse and in flink cluster

2018-11-07 Thread Till Rohrmann
Hi Teena, which Flink version are you using? Have you tried whether this happens with the latest release 1.6.2 as well? Cheers, Till On Fri, Oct 26, 2018 at 1:17 PM Teena Kappen // BPRISE < teena.kap...@bprise.com> wrote: > Hi all, > > > > I have implemented RichInputFormat for reading result o

[ANNOUNCE] Weekly community update #45

2018-11-06 Thread Till Rohrmann
Dear community, this is the weekly community update thread #45. Please post any news and updates you want to share with the community to this thread. # First release candidate for Flink 1.7.0 The community has published the first release candidate for Flink 1.7.0 [0]. Please help the community b

Re: Why dont't have a csv formatter for kafka table source

2018-11-02 Thread Till Rohrmann
Hi Jocean, these kind of issues should go to the user mailing list. I've cross posted it there and put dev to bcc. Cheers, Till On Fri, Nov 2, 2018 at 6:43 AM Jocean shi wrote: > Hi all, > I have encountered a error When i want to register a table from kafka > using csv formatter. >

[ANNOUNCE] Weekly community update #44

2018-10-30 Thread Till Rohrmann
Dear community, this is the weekly community update thread #44. Please post any news and updates you want to share with the community to this thread. # Flink 1.5.5 and 1.6.2 have been released The Flink community is proud to announce the two bugfix release Flink 1.5.5 [1] and Flink 1.6.2 [2] hav

Re: Job fails to restore from checkpoint in Kubernetes with FileNotFoundException

2018-10-30 Thread Till Rohrmann
As Vino pointed out, you need to configure a checkpoint directory which is accessible from all TMs. Otherwise you won't be able to recover the state if the task gets scheduled to a different TaskManager. Usually, people use HDFS or S3 for that. Cheers, Till On Tue, Oct 30, 2018 at 9:50 AM vino ya

Re: [ANNOUNCE] Apache Flink 1.6.2 released

2018-10-29 Thread Till Rohrmann
Awesome! Thanks a lot to you Chesnay for being our release manager and to the community for making this release happen. Cheers, Till On Mon, Oct 29, 2018 at 8:37 AM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink 1.6.2, which is the s

Re: [ANNOUNCE] Apache Flink 1.5.5 released

2018-10-29 Thread Till Rohrmann
Great news. Thanks a lot to you Chesnay for being our release manager and the community for making this release possible. Cheers, Till On Mon, Oct 29, 2018 at 8:36 AM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink 1.5.5, which is the

Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Till Rohrmann
Hi Mike, thanks for reporting this issue. I think you're right that Flink leaves some empty nodes in ZooKeeper. It seems that we don't delete the node with all its children in ZooKeeperHaServices#closeAndCleanupAllData. Could you please open a JIRA issue to in order to fix it? Thanks a lot! Che

Re: Task manager count goes the expand then converge process when running flink on YARN

2018-10-25 Thread Till Rohrmann
Hi Henry, since version 1.5 you don't need to specify the number of TaskManagers to start, because the system will figure this out. Moreover, in version 1.5.x and 1.6.x it is recommended to set the number of slots per TaskManager to 1 since we did not support multi task slot TaskManagers properly.

[ANNOUNCE] Flink Forward San Francisco 2019 - Call for Presentations is now open

2018-10-24 Thread Till Rohrmann
Hi everybody, the Call for Presentations for Flink Forward San Francisco 2019 is now open! Apply by November 30 to share your compelling Flink use case, best practices, and latest developments with the community on April 1-2 in San Francisco, CA. Submit your proposal: https://flink-forward.org/ca

[ANNOUNCE] Weekly community update #43

2018-10-23 Thread Till Rohrmann
Dear community, this is the weekly community update thread #43. Please post any news and updates you want to share with the community to this thread. # Release vote for Flink 1.5.5 and 1.6.2 The community is currently voting on the first release candidates for Flink 1.5.5 [1] and Flink 1.6.2 [2]

[ANNOUNCE] Weekly community update #42

2018-10-17 Thread Till Rohrmann
Dear community, this is the weekly community update thread #42. Please post any news and updates you want to share with the community to this thread. # Discussion about Flink SQL integration with Hive Xuefu started a discussion about how to integrate Flink SQL with the Hive ecosystem [1]. If tha

Re: Flink1.6 Confirm that flink supports the plan function?

2018-10-15 Thread Till Rohrmann
Hi, 1) you currently cannot merge multiple jobs into one after they have been submitted. What you can do though, is to combine multiple jobs in your Flink program before you submit it. 2) you can pass program arguments when you submit your job. After it has been submitted, it is no longer possibl

<    2   3   4   5   6   7   8   9   10   11   >