Re: HA jobmanagers redirect to ip address of leader instead of hostname

2018-11-07 Thread Till Rohrmann
Hi Jeroen, this sounds like a bug in Flink: we sometimes return IP addresses instead of hostnames. Could you tell me which Flink version you are using? In the current version, the redirect address and the address retrieved from ZooKeeper should actually be the same. In the future, we plan to

Re: RichInputFormat working differently in eclipse and in flink cluster

2018-11-07 Thread Till Rohrmann
Hi Teena, which Flink version are you using? Have you tried whether this happens with the latest release 1.6.2 as well? Cheers, Till On Fri, Oct 26, 2018 at 1:17 PM Teena Kappen // BPRISE < teena.kap...@bprise.com> wrote: > Hi all, > > > > I have implemented RichInputFormat for reading result

[ANNOUNCE] Weekly community update #45

2018-11-06 Thread Till Rohrmann
Dear community, this is the weekly community update thread #45. Please post any news and updates you want to share with the community to this thread. # First release candidate for Flink 1.7.0 The community has published the first release candidate for Flink 1.7.0 [0]. Please help the community

Re: Why dont't have a csv formatter for kafka table source

2018-11-02 Thread Till Rohrmann
Hi Jocean, these kinds of issues should go to the user mailing list. I've cross-posted it there and put dev in bcc. Cheers, Till On Fri, Nov 2, 2018 at 6:43 AM Jocean shi wrote: > Hi all, > I have encountered an error when I want to register a table from kafka > using csv formatter. >

[ANNOUNCE] Weekly community update #44

2018-10-30 Thread Till Rohrmann
Dear community, this is the weekly community update thread #44. Please post any news and updates you want to share with the community to this thread. # Flink 1.5.5 and 1.6.2 have been released The Flink community is proud to announce the two bugfix releases Flink 1.5.5 [1] and Flink 1.6.2 [2]

Re: Job fails to restore from checkpoint in Kubernetes with FileNotFoundException

2018-10-30 Thread Till Rohrmann
As Vino pointed out, you need to configure a checkpoint directory which is accessible from all TMs. Otherwise you won't be able to recover the state if the task gets scheduled to a different TaskManager. Usually, people use HDFS or S3 for that. Cheers, Till On Tue, Oct 30, 2018 at 9:50 AM vino

Re: [ANNOUNCE] Apache Flink 1.6.2 released

2018-10-29 Thread Till Rohrmann
Awesome! Thanks a lot to you Chesnay for being our release manager and to the community for making this release happen. Cheers, Till On Mon, Oct 29, 2018 at 8:37 AM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink 1.6.2, which is the

Re: [ANNOUNCE] Apache Flink 1.5.5 released

2018-10-29 Thread Till Rohrmann
Great news. Thanks a lot to you Chesnay for being our release manager and the community for making this release possible. Cheers, Till On Mon, Oct 29, 2018 at 8:36 AM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink 1.5.5, which is

Re: Flink-1.6.1 :: HighAvailability :: ZooKeeperRunningJobsRegistry

2018-10-26 Thread Till Rohrmann
Hi Mike, thanks for reporting this issue. I think you're right that Flink leaves some empty nodes in ZooKeeper. It seems that we don't delete the node with all its children in ZooKeeperHaServices#closeAndCleanupAllData. Could you please open a JIRA issue in order to fix it? Thanks a lot!

Re: Task manager count goes the expand then converge process when running flink on YARN

2018-10-25 Thread Till Rohrmann
Hi Henry, since version 1.5 you don't need to specify the number of TaskManagers to start, because the system will figure this out. Moreover, in version 1.5.x and 1.6.x it is recommended to set the number of slots per TaskManager to 1 since we did not support multi task slot TaskManagers

[ANNOUNCE] Flink Forward San Francisco 2019 - Call for Presentations is now open

2018-10-24 Thread Till Rohrmann
Hi everybody, the Call for Presentations for Flink Forward San Francisco 2019 is now open! Apply by November 30 to share your compelling Flink use case, best practices, and latest developments with the community on April 1-2 in San Francisco, CA. Submit your proposal:

[ANNOUNCE] Weekly community update #43

2018-10-23 Thread Till Rohrmann
Dear community, this is the weekly community update thread #43. Please post any news and updates you want to share with the community to this thread. # Release vote for Flink 1.5.5 and 1.6.2 The community is currently voting on the first release candidates for Flink 1.5.5 [1] and Flink 1.6.2

[ANNOUNCE] Weekly community update #42

2018-10-17 Thread Till Rohrmann
Dear community, this is the weekly community update thread #42. Please post any news and updates you want to share with the community to this thread. # Discussion about Flink SQL integration with Hive Xuefu started a discussion about how to integrate Flink SQL with the Hive ecosystem [1]. If

Re: Flink1.6 Confirm that flink supports the plan function?

2018-10-15 Thread Till Rohrmann
Hi, 1) you currently cannot merge multiple jobs into one after they have been submitted. What you can do though, is to combine multiple jobs in your Flink program before you submit it. 2) you can pass program arguments when you submit your job. After it has been submitted, it is no longer

Re: org.apache.flink.runtime.rpc.exceptions.FencingTokenException:

2018-10-15 Thread Till Rohrmann
48b813d3a464b, > LocalRpcInvocation(requestRestAddress(Time))) sent to akka.tcp:// > flink@127.0.0.1:50010/user/dispatcher because the fencing token is null. > > > > Warm Regards, > > *Samir Chauhan* > > > > *From:* Till Rohrmann [mailto:trohrm...@apache.org]

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-15 Thread Till Rohrmann
orNotFound: > Actor not found for: > ActorSelection[Anchor(akka.tcp://flink@taskmgr-6b59f97748-fmgwn:8070/), > Path(/user/MetricQueryService_5261ccab66b86b53a4edd64f26c1f282)]"... > > ... > > > We figured there is some problem with hostname resolution after the actor is > quaran

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-12 Thread Till Rohrmann
t 11, 2018 at 11:15 AM Abdul Qadeer > wrote: > >> Hi Till, >> >> I didn't try with newer versions as it is not possible to update the >> Flink version atm. >> If you could give any pointers for debugging that would be great. >> >> On Thu, Oct 11, 2018 a

Re: Taskmanager times out continuously for registration with Jobmanager

2018-10-11 Thread Till Rohrmann
Hi Abdul, have you tried whether this problem also occurs with newer Flink versions (1.5.4 or 1.6.1)? Cheers, Till On Thu, Oct 11, 2018 at 9:24 AM Dawid Wysakowicz wrote: > Hi Abdul, > > I've added Till and Gary to cc, who might be able to help you. > > Best, > > Dawid > > On 11/10/18 03:05,

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

2018-10-09 Thread Till Rohrmann
Hi jwatte, sorry for the inconvenience. I hope that the Docker Hub images have been updated by now. Cheers, Till On Wed, Oct 10, 2018, 05:20 vino yang wrote: > Hi jwatte, > > Maybe Till can help you. > > Thanks, vino. > > jwatte wrote on Tue, Oct 2, 2018 at 5:30 AM: > >> It turns out that the latest

[ANNOUNCE] Weekly community update #41

2018-10-09 Thread Till Rohrmann
Dear community, this is the weekly community update thread #41. Please post any news and updates you want to share with the community to this thread. # Feature freeze for Flink 1.7 The community has decided to freeze the feature development for Flink 1.7.0 on the 22nd of October [1]. # Flink

Re: flink memory management / temp-io dir question

2018-10-08 Thread Till Rohrmann
Hi Anand, spilling using the io directories is only relevant for Flink's batch processing. This happens, for example if you enable blocking data exchange where the produced data cannot be kept in memory. Moreover, it is used by many of Flink's out-of-core data structures to enable exactly this

Re: [DISCUSS] Dropping flink-storm?

2018-10-08 Thread Till Rohrmann
sus. For starters it is > contradictory; we can't *drop *flink-storm yet still *keep **some parts*. > > From my understanding we drop flink-storm completely, and put a note in > the docs that the bolt/spout wrappers of previous versions will continue to > work. > > On 08.10.2018 11

Re: [DISCUSS] Dropping flink-storm?

2018-10-08 Thread Till Rohrmann
509 for > removing flink-storm. > > On 28.09.2018 15:22, Till Rohrmann wrote: > > Hi everyone, > > > > I would like to discuss how to proceed with Flink's storm compatibility > > layer flink-storm. > > > > While working on removing Flink's legacy mode, I n

Re: org.apache.flink.runtime.rpc.exceptions.FencingTokenException:

2018-10-06 Thread Till Rohrmann
bmanager.rpc.address > > rest.address > > rest.bind-address > > jobmanager.web.address > >1. Is there anything I should be take care while setting it up? >2. How do I know which job manager is active? >3. How do I secure my cluster? > > > > Samir

Re: Flink 1.5.2 - excessive ammount of container requests, Received new/Returning excess container "flood"

2018-10-06 Thread Till Rohrmann
Hi Borys, if possible the complete logs of the JM (DEBUG log level) would be helpful to further debug the problem. Have there been recovery operations lately? Cheers, Till On Sat, Oct 6, 2018 at 11:15 AM Gary Yao wrote: > Hi Borys, > > To debug how many containers Flink is requesting, you can

Re: org.apache.flink.runtime.rpc.exceptions.FencingTokenException:

2018-10-05 Thread Till Rohrmann
er.path.root: > /flink as it is running in different server. > > Also /sharedflink is storage shared across JM and TM. Does it require to > be available in Zookeeper server also? > > Is there any special instruction for me which I should take care? > > > > Samir Chauhan >

Re: org.apache.flink.runtime.rpc.exceptions.FencingTokenException:

2018-10-05 Thread Till Rohrmann
Hi Samir, could you share the logs of the two JMs and the log where you saw the FencingTokenException with us? It looks to me as if the TM had an outdated fencing token (an outdated leader session id) with which it contacted the ResourceManager. This can happen and the TM should try to reconnect
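The fencing idea described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class `FencedEndpoint` and its methods are hypothetical names, not Flink's implementation, and the fencing token is modeled as a `UUID` standing in for the leader session id. The endpoint rejects every message while it holds no token and every message that carries an outdated one.

```java
import java.util.Objects;
import java.util.UUID;

// Illustrative sketch only: hypothetical names, not Flink's RPC classes.
class FencedEndpoint {
    // Token of the current leader session; null until leadership is granted.
    private UUID currentToken;

    // Called when this endpoint becomes leader with a fresh session id.
    void grantLeadership(UUID newToken) {
        currentToken = Objects.requireNonNull(newToken, "newToken");
    }

    // A message is accepted only if the sender knows the current token.
    boolean accepts(UUID senderToken) {
        return currentToken != null && currentToken.equals(senderToken);
    }

    public static void main(String[] args) {
        FencedEndpoint endpoint = new FencedEndpoint();
        UUID oldSession = UUID.randomUUID();
        UUID newSession = UUID.randomUUID();

        endpoint.grantLeadership(oldSession);
        System.out.println(endpoint.accepts(oldSession)); // true

        // After a leader change, a sender holding the old token is rejected
        // and has to look up the new leader session before reconnecting.
        endpoint.grantLeadership(newSession);
        System.out.println(endpoint.accepts(oldSession)); // false
        System.out.println(endpoint.accepts(newSession)); // true
    }
}
```

In this model a sender with an outdated token simply gets a rejection; recovering means fetching the current leader session id and retrying.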

Re: Unable to start session cluster using Docker

2018-10-05 Thread Till Rohrmann
Hi Vinay, are you referring to flink-contrib/docker-flink/docker-compose.yml? We recently fixed the command line parsing with Flink 1.5.4 and 1.6.1. Due to this, the removal of the second command line parameter intended to be introduced with 1.5.0 and 1.6.0 (see

Re: [DISCUSS] Breaking the Scala API for Scala 2.12 Support

2018-10-05 Thread Till Rohrmann
Thanks Aljoscha for starting this discussion. The described problem brings us indeed a bit into a pickle. Even with option 1) I think it is somewhat API breaking because everyone who used lambdas without types needs to add them now. Consequently, I only see two real options out of the ones you've

[ANNOUNCE] Weekly community update #40

2018-10-04 Thread Till Rohrmann
Dear community, this is the weekly community update thread #40. Please post any news and updates you want to share with the community to this thread. # Discussing feature freeze for Flink 1.7 The community is currently discussing the feature freeze for Flink 1.7 [1]. The 22nd of October is

Re: Data loss when restoring from savepoint

2018-10-04 Thread Till Rohrmann
Hi Juho, another idea to further narrow down the problem could be to simplify the job to not use a reduce window but simply a time window which outputs the window events. Then counting the input and output events should allow you to verify the results. If you are not seeing missing events, then

Re: Running job in detached mode via ClusterClient.

2018-10-02 Thread Till Rohrmann
Hi Piotr, the reason why you cannot submit multiple jobs to a job cluster is that a job cluster is only responsible for a single job. If you want to submit multiple jobs, then you need to start a session cluster. In attached mode, this is currently still possible, because under the hood, we

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Till Rohrmann
> > I also believe that it's a problem with a single mirror. It was not a > blocking problem for me; I just wanted you to be aware of it, in case you > have some policy regarding the management of mirrors. > > Best, > Gianluca > > > On Mon, 1 Oct 2018 at 14:27, Till Rohrma

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-10-01 Thread Till Rohrmann
strange, Till may be able to give an explanation, because >>> he is the release manager of this version. >>> >>> Thanks, vino. >>> >>> Gianluca Ortelli wrote on Fri, Sep 28, 2018 at 4:02 PM: >>> >>>> Hi, >>>> >>>> I ju

[DISCUSS] Dropping flink-storm?

2018-09-28 Thread Till Rohrmann
Hi everyone, I would like to discuss how to proceed with Flink's storm compatibility layer flink-storm. While working on removing Flink's legacy mode, I noticed that some parts of flink-storm rely on the legacy Flink client. In fact, at the moment flink-storm does not work together with Flink's

Re: Flink 1.5.4 -- issues w/ TaskManager connecting to ResourceManager

2018-09-28 Thread Till Rohrmann
. This could however lead to problems if a user wants to set the hostname to either `local` or `cluster` via jobmanager.sh. Cheers, Till On Wed, Sep 26, 2018 at 11:24 AM Till Rohrmann wrote: > Yes, that would be a good idea. I think it should go into the release > notes. Will add it. > > On

Re: New received containers silently lost during job auto restarts

2018-09-28 Thread Till Rohrmann
Hi Paul, this looks to me like a Yarn setup problem. Could you check which value you have set for dfs.namenode.delegation.token.max-lifetime? Per default it is set to 7 days. The reason why Flink needs to access HDFS is because the binaries and the configuration files are stored there for an

Re: Help: how to get latency value comfortable in Flink1.5?

2018-09-26 Thread Till Rohrmann
Hi Rui, 1) Flink should always fetch records from Kafka if there are some, independent of the parallelism of the consumer. The only problem which could appear is that if you set the parallelism higher than the number of partitions, some of the source operators won't get a partition assigned. Due to
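To see why a parallelism higher than the partition count leaves some source subtasks without work, here is a minimal plain-Java sketch. It assumes a simple modulo assignment of partitions to subtasks; the class and method names are hypothetical, and the real Kafka consumer may distribute partitions differently, but the counting argument is the same.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: simple modulo assignment, not Flink's Kafka consumer code.
class PartitionAssignmentSketch {

    // Returns, for each subtask index, the partition ids assigned to it.
    static List<List<Integer>> assign(int numPartitions, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int subtask = 0; subtask < parallelism; subtask++) {
            result.add(new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            result.get(partition % parallelism).add(partition);
        }
        return result;
    }

    // Counts the subtasks that received no partition at all.
    static int idleSubtasks(int numPartitions, int parallelism) {
        int idle = 0;
        for (List<Integer> partitions : assign(numPartitions, parallelism)) {
            if (partitions.isEmpty()) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        // 3 partitions, 5 source subtasks: two subtasks stay idle.
        System.out.println(assign(3, 5));       // [[0], [1], [2], [], []]
        System.out.println(idleSubtasks(3, 5)); // 2
    }
}
```

With any assignment scheme that gives each partition to exactly one subtask, at least `parallelism - numPartitions` subtasks remain idle.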

Re: Flink 1.5.4 -- issues w/ TaskManager connecting to ResourceManager

2018-09-26 Thread Till Rohrmann
tzger < > rmetz...@apache.org>: > >> Hey Jamie, >> >> we've been facing the same issue with dA Platform, when running Flink >> 1.6.1. >> I assume a lot of people will be affected by this. >> >> >> >> On Tue, Sep 25, 2018 at 11:18 PM Till

Re: Question about Window Tigger

2018-09-26 Thread Till Rohrmann
> > Best regards/祝好, > > Chang Liu 刘畅 > > > On 25 Sep 2018, at 23:41, Till Rohrmann wrote: > > AssignerWithPeriodicWatermarks > > >

Re: Should Queryable State Server be listening on 127.0.1.1?

2018-09-26 Thread Till Rohrmann
Hi Andrew, I think you ran into the same problem we discussed here [1]. I think it is a bug and the KvStateServerImpl or more specifically the AbstractServerBase should bind to any address. Or at least it should be configurable similar to RestOptions#BIND_ADDRESS. I've opened a JIRA issue to fix

Re: Could not find a suitable table factory for 'org.apache.flink.table.factories.DeserializationSchemaFactory' in the classpath.

2018-09-25 Thread Till Rohrmann
Hi, I've pulled in Timo, who can help you with your problem. Cheers, Till On Tue, Sep 25, 2018 at 12:02 PM clay wrote: > hi: > I am using flink's table api, I receive data from kafka, then register it > as > a table, then I use sql statement to process, and finally convert the > result >

Re: JobManager container is running beyond physical memory limits

2018-09-25 Thread Till Rohrmann
Hi, what changed between version 1.4.2 and 1.5.2 was the addition of the application level flow control mechanism which changed a bit how the network buffers are configured. This could be a potential culprit. Since you said that the container ran for some time, I'm wondering whether there is

Re: Question about Window Tigger

2018-09-25 Thread Till Rohrmann
Hi Chang Liu, maybe you could use the AssignerWithPeriodicWatermarks to build a custom watermark assigner which creates the watermarks based on the incoming events and if it detects that no events are coming, that it progresses the watermark with respect to the wall clock time. Cheers, Till On
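The suggestion above can be sketched independently of Flink. The class below is a hypothetical plain-Java model, not an implementation of the `AssignerWithPeriodicWatermarks` interface. It assumes event timestamps and wall-clock time are both in milliseconds and that, once the source has been idle for longer than `idleTimeoutMs`, it is acceptable to advance the watermark at wall-clock speed. The wall-clock value is passed in explicitly so the logic can be tested deterministically.

```java
// Illustrative sketch in plain Java; names and parameters are assumptions.
class IdleAwareWatermarkSketch {
    private final long maxOutOfOrdernessMs;
    private final long idleTimeoutMs;

    private long maxEventTimestamp = Long.MIN_VALUE;  // highest event time seen
    private long lastEventWallClock = Long.MIN_VALUE; // wall clock of last event
    private long lastWatermark = Long.MIN_VALUE;      // keeps watermarks monotonic

    IdleAwareWatermarkSketch(long maxOutOfOrdernessMs, long idleTimeoutMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    // Called for every incoming event.
    void onEvent(long eventTimestamp, long wallClockNow) {
        maxEventTimestamp = Math.max(maxEventTimestamp, eventTimestamp);
        lastEventWallClock = wallClockNow;
    }

    // Called periodically; returns the current watermark.
    long currentWatermark(long wallClockNow) {
        if (maxEventTimestamp == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // no events seen yet
        }
        long candidate = maxEventTimestamp - maxOutOfOrdernessMs;
        long idleFor = wallClockNow - lastEventWallClock;
        if (idleFor > idleTimeoutMs) {
            // No events for a while: let the watermark follow the wall clock.
            candidate += idleFor - idleTimeoutMs;
        }
        // Never let the watermark move backwards.
        lastWatermark = Math.max(lastWatermark, candidate);
        return lastWatermark;
    }

    public static void main(String[] args) {
        IdleAwareWatermarkSketch w = new IdleAwareWatermarkSketch(100, 1000);
        w.onEvent(5000, 0);
        System.out.println(w.currentWatermark(500));  // 4900
        System.out.println(w.currentWatermark(3000)); // 6900
    }
}
```

One consequence of this design: events that arrive after an idle period with timestamps below the advanced watermark count as late, so the idle timeout should be chosen with the expected gaps in the input in mind.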

Re: "405 HTTP method POST is not supported by this URL" is returned when POST a REST url on Flink on yarn

2018-09-25 Thread Till Rohrmann
Hi Henry, I think when running Flink on Yarn, then you must not go through the Yarn proxy. Instead you should directly send the post request to the node on which the application master runs. When starting a Flink Yarn session via yarn-session.sh, then the web interface URL is printed to stdout,

Re: Strange behavior of FsStateBackend checkpoint when local executing

2018-09-25 Thread Till Rohrmann
Hi Henry, which version of Flink are you using? If you could provide us with a working example to reproduce the problem, then I'm sure that we can figure out why it is not working as expected. Cheers, Till On Tue, Sep 25, 2018 at 8:44 AM 徐涛 wrote: > Hi All, > I use using a

Re: Flink 1.5.4 -- issues w/ TaskManager connecting to ResourceManager

2018-09-25 Thread Till Rohrmann
Hi Jamie, thanks for the update on how to fix the problem. This is very helpful for the rest of the community. The change of removing the execution mode parameter (FLINK-8696) from the start up scripts was actually released with Flink 1.5.0. That way, the host name became the 2nd parameter. By

Re: Scheduling sources

2018-09-25 Thread Till Rohrmann
Hi Averell, such a feature is currently not supported by Flink. The scheduling works by starting all sources at the same time. Depending on whether it is a batch or streaming job, you either start deploying consumers once producers have produced some results or right away. Cheers, Till On Tue, Sep

Re: 1.5 Checkpoint metadata location

2018-09-25 Thread Till Rohrmann
Hi Bryant, I think if you explicitly define the StateBackend in your code (calling StreamExecutionEnvironment#setStateBackend), then you also define the checkpointing directory when calling the StateBackend's constructor. This is also the directory in which the metadata files are stored. You

Re: Flink TaskManagers do not start until job is submitted in YARN

2018-09-25 Thread Till Rohrmann
With Flink 1.5.x and 1.6.x you can put `mode: legacy` into your flink-conf.yaml and it will start the old mode. Then you have the old behaviour. What do you mean by total slots? The current number of total slots? With resource elasticity this number can of course change because if you don't

Re: Information required regarding SSL algorithms for Flink 1.5.x

2018-09-25 Thread Till Rohrmann
Flink 1.5.x supports the same set of algorithms as does Flink 1.6. However, it mainly depends on the used Java version which algorithms you can use. There are certain Java versions which don't support all cipher suites. See https://issues.apache.org/jira/browse/FLINK-9424 for example. Cheers,

Re: Flink TaskManagers do not start until job is submitted in YARN

2018-09-24 Thread Till Rohrmann
Hi Suraj, at the moment Flink's new mode does not support such a behaviour. There are plans to set a min number of running TaskManagers which won't be released. But no work has been done in this direction yet, afaik. If you want, then you can help the community with this effort. Cheers, Till On

[ANNOUNCE] Weekly community update #39

2018-09-24 Thread Till Rohrmann
Dear community, this is the weekly community update thread #39. Please post any news and updates you want to share with the community to this thread. # Flink 1.6.1 and Flink 1.5.4 released The community has released new bug fix releases: Flink 1.6.1 and Flink 1.5.4 [1, 2]. # Open source review

Re: Running Flink in Google Cloud Platform (GCP) - can Flink be truly elastic?

2018-09-24 Thread Till Rohrmann
Hi Alexander, the issue for the reactive mode, the mode which reacts to newly available resources and scales up accordingly, is here: https://issues.apache.org/jira/browse/FLINK-10407. It does not contain a lot of details but we are actively working on publishing the corresponding design

[ANNOUNCE] Apache Flink 1.6.1 released

2018-09-20 Thread Till Rohrmann
The Apache Flink community is very happy to announce the release of Apache Flink 1.6.1, which is the first bug fix release for the Apache Flink 1.6 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming

[ANNOUNCE] Apache Flink 1.5.4 released

2018-09-20 Thread Till Rohrmann
The Apache Flink community is very happy to announce the release of Apache Flink 1.5.4, which is the fourth bug fix release for the Apache Flink 1.5 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming

Re: HA failing for 1.6.0 job cluster with docker-compose

2018-09-20 Thread Till Rohrmann
Hi Tzanko, in order to make the container entrypoint properly work with HA, we need to fix the JobID (see https://issues.apache.org/jira/browse/FLINK-10291). At the moment, we generate a new JobID for every restart of the cluster entrypoint container. Due to that the system cannot find the

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-19 Thread Till Rohrmann
e something with our pipeline itself. I'll enable debug > logs, then. > > On Tue, Sep 18, 2018 at 4:20 AM, Till Rohrmann > wrote: > >> This behavior seems very odd Julio. Could you indeed share the debug logs >> of all Flink processes in order to see why things are takin

Re: Flink 1.6.0 not allocating specified TMs in Yarn

2018-09-18 Thread Till Rohrmann
nning TM/Container. >> >> How do I restrict the number of containers/cores per container. Seems >> like -ytm is just a suggestion. I assume parallelism is within the realm >> of a single container, so I would use 5 to say I want 5 cores within one TM >> ? Is that again a

Re: Job goes down after 1/145 TMs is lost (NoResourceAvailableException)

2018-09-18 Thread Till Rohrmann
Watcher - Detected >> unreachable: [akka.tcp://flink@hello-world3-3:44607] >> 2018-09-11 16:42:58,409 INFO >> org.apache.flink.yarn.YarnJobManager - Task manager akka.tcp://flink@ >> hello-world3-3/user/taskmanager terminated. >> at >> org.apache

Re: Trying to figure out why a slot takes a long time to checkpoint

2018-09-18 Thread Till Rohrmann
This behavior seems very odd Julio. Could you indeed share the debug logs of all Flink processes in order to see why things are taking so long? The checkpoint size of task #8 is twice as big as the second biggest checkpoint. But this should not cause an increase in checkpoint time of a factor of

[ANNOUNCE] Weekly community update #38

2018-09-17 Thread Till Rohrmann
Dear community, this is the weekly community update thread #38. Please post any news and updates you want to share with the community to this thread. # Release vote for Flink 1.6.1 and 1.5.4 The community is currently voting on the next bug fix releases Flink 1.6.1 [1] and Flink 1.5.4 [2].

Re: Client failed to get cancel with savepoint response

2018-09-17 Thread Till Rohrmann
Hi Paul, your analysis is right. The JobManager does not wait for pending operation results to be properly served. See https://issues.apache.org/jira/browse/FLINK-10309 for more details. I think a way to solve it is to wait for some time if the RestServerEndpoint still has some responses to

Re: Flink 1.6.0 not allocating specified TMs in Yarn

2018-09-17 Thread Till Rohrmann
With Flink 1.6.0 it is no longer needed to specify the number of started containers (-yn 145). Flink will dynamically allocate containers. That's also the reason why you don't see registered TMs without a running job. Moreover, it is recommended to start every container with a single slot (no -ys 5).

Re: Unit / Integration Test Timer

2018-09-17 Thread Till Rohrmann
tor harnesses doesn't sound like the > right approach. let me know otherwise. > > Thanks, > > > On Friday, September 14, 2018, 11:42:17 AM EDT, Till Rohrmann < > trohrm...@apache.org> wrote: > > > Hi Ashish, > > how do you make sure that all of your data is not consumed

Re: Task managers run on separate nodes in a cluster

2018-09-17 Thread Till Rohrmann
>> >> Any other way of load balancing? >> Maybe by not even using the DCOS Flink package (mesos flink framework) at >> all? >> Any plans to add support for the fenzo balanced host attribute constraint? >> >> Thanks, >> M >> >> >> &g

Re: Task managers run on separate nodes in a cluster

2018-09-14 Thread Till Rohrmann
Hi Martin, Flink supports the mesos.constraints.hard.hostattribute to specify task constraints based on agent attributes [1]. I think you could use them to control the task placement. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#mesos-constraints-hard-hostattribute

Re: Problem with querying state on Flink 1.6.

2018-09-14 Thread Till Rohrmann
e code of your job or a minimal example that can reproduce it. >> >> Thanks, >> Kostas >> >> On Sep 12, 2018, at 9:59 AM, Till Rohrmann wrote: >> >> Hi Joe, >> >> what is the current problem you are facing? >> >> Cheers, >>

Re: Unit / Integration Test Timer

2018-09-14 Thread Till Rohrmann
Hi Ashish, how do you make sure that all of your data is not consumed within a fraction of the 2 seconds? For this it would be better to use event time which allows you to control how time passes. If you want to test a specific operator you could try out the One/TwoInputStreamOperatorTestHarness.

Re: Problem with querying state on Flink 1.6.

2018-09-12 Thread Till Rohrmann
ddress is based on the >> "taskmanager.hostname" configuration override and, by default, the >> RpcService address. >> >> A possible explanation is that, on Joe's machine, Java's >> `InetAddress.getLocalHost()` resolves to the loopback address. I be

Re: Exception when run flink-storm-example

2018-09-11 Thread Till Rohrmann
> On 9/11/2018 14:43, Till Rohrmann wrote: > > Hi Hanjing, > > I thi

Re: Exception when run flink-storm-example

2018-09-11 Thread Till Rohrmann
Hi Hanjing, I think the problem is that the Storm compatibility layer only works with legacy mode at the moment. Please set `mode: legacy` in your flink-conf.yaml. I hope this will resolve the problems. Cheers, Till On Tue, Sep 11, 2018 at 7:10 AM jing wrote: > Hi vino, > Thank you very much.

[ANNOUNCE] Weekly community update #37

2018-09-10 Thread Till Rohrmann
Dear community, this is the weekly community update thread #37. Please post any news and updates you want to share with the community to this thread. # Flink Forward Berlin 2018 over Last week the community met in Berlin for the 4th edition of Flink Forward 2018 [1]. It was great to see people

Re: Job goes down after 1/145 TMs is lost (NoResourceAvailableException)

2018-09-06 Thread Till Rohrmann
remotewatcher >> detects it unreachable and the TaskManager unregisters it, I do not see any >> other activity in the JobManager with regards to reallocation etc. >> - Does the quarantining of the TaskManager not happen until the >> exit-on-fatal-akka-error is turned on

Re: Job goes down after 1/145 TMs is lost (NoResourceAvailableException)

2018-08-31 Thread Till Rohrmann
ation_1535135887442_0906. > > 2018-08-29 23:15:31,370 INFO org.apache.flink.yarn.YarnJobManager > - Stopping JobManager akka.tcp://fl...@blahxyz.sfdc.net > <http://blahabc.sfdc.net/>:1235/user/jobmanager. > > 2018-08-29 23:15:41,225 ERROR org.apache.flink.yarn.YarnJobMa

Re: Configuring Ports for Job/Task Manager Metrics

2018-08-31 Thread Till Rohrmann
Hi Deirdre, Flink does not support controlling where Yarn containers are placed. This is the responsibility of Yarn as the cluster manager. In Yarn 3.1.0 it is possible to specify placement constraints for containers, but even this won't fully solve your problem. Imagine that you have a single Yarn

Re: WriteTimeoutException in Cassandra sink kill the job

2018-08-30 Thread Till Rohrmann
Hi Jayant, afaik it is currently not possible to control how failures are handled in the Cassandra Sink. What would be the desired behaviour? The best thing is to open a JIRA issue to discuss potential improvements. Cheers, Till On Thu, Aug 30, 2018 at 12:15 PM Jayant Ameta wrote: > Hi, >

Re: History Server in Kubernetes

2018-08-30 Thread Till Rohrmann
Hi Encho, currently, the existing image does not support starting a HistoryServer. The reason is simply that it has not been exposed because the image contains everything needed. In order to do this, you would need to extend the docker-entrypoint.sh script with an additional history-server

Re: Job goes down after 1/145 TMs is lost (NoResourceAvailableException)

2018-08-30 Thread Till Rohrmann
Hi Subramanya, in order to help you I need a little bit of context. Which version of Flink are you running? The configuration yarn.reallocate-failed is deprecated since Flink 1.1 and does not have an effect anymore. What would be helpful is to get the full jobmanager log from you. If the

Re: withFormat(Csv) is undefined for the type BatchTableEnvironment

2018-08-30 Thread Till Rohrmann
Hi François, as Vino said, the BatchTableEnvironment does not provide a `withFormat` method. Admittedly, the documentation does not state it too explicitly but you can only call the `withFormat` method on a table connector as indicated here [1]. If you think that you need to get the data from

Re: checkpoint timeout

2018-08-30 Thread Till Rohrmann
Hi John, which version of Flink are you using? I just tried it out with the current snapshot version and I could configure the checkpoint timeout via `CheckpointConfig checkpointConfig = env.getCheckpointConfig(); checkpointConfig.setCheckpointTimeout(1337L);` Could you provide us with the logs and

Re: Problem with querying state on Flink 1.6.

2018-08-30 Thread Till Rohrmann
Hi Joe, it looks as if the queryable state server binds to the local loopback address. This looks like a bug to me. Could you maybe share the complete cluster entrypoint and the task manager logs with me? In the meantime you could try to do the following: Change AbstractServerBase.java:227 into

[ANNOUNCE] Weekly community update #35

2018-08-29 Thread Till Rohrmann
Dear community, this is the weekly community update thread #35. Please post any news and updates you want to share with the community to this thread. # Flink 1.7 community roadmap The Flink community started discussing which features will be included in the next upcoming major release. Join the

Re: When a jobmanager fails, it doesn't restart because it tries to restart non existing tasks

2018-08-29 Thread Till Rohrmann
tml > https://issues.apache.org/jira/browse/FLINK-10011 > > so I guess I'll have to wait until there is a fix as I have not seens any > workaround (other than removing the jobgraph from Zookeeper after > cancelling the task) > > Thanks, > > Gerard > > > On Tue, Jul 2

Re: JobGraphs not cleaned up in HA mode

2018-08-29 Thread Till Rohrmann
gets:* > *-FlinkBatchPipelineTranslator pipeline logs- (we use Apache Beam)* > > 2018-08-29 11:47:06,006 INFO > org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Submitting > job d69b67e4d28a2d244b06d3f6d661bca1 > (sicassandrawriterbeam-flink-0829114703-7d95fabd). >

Re: JobGraphs not cleaned up in HA mode

2018-08-29 Thread Till Rohrmann
08-28 11:54:10,789 INFO >>>> org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - >>>> Recovered SubmittedJobGraph(9e0a109b57511930c95d3b54574a66e3, null). >>>> >>>> Note that this log appears on the standby jobmanager immedia

Re: anybody can start flink with job mode?

2018-08-29 Thread Till Rohrmann
Great to hear :-) On Tue, Aug 28, 2018, 14:59 Hao Sun wrote: > Thanks Till for the follow up, I can run my job now. > > On Tue, Aug 28, 2018, 00:57 Till Rohrmann wrote: > >> Hi Hao, >> >> Vino is right, you need to specify the -j/--job-classname option which >

Re: [DISCUSS] Remove the slides under "Community & Project Info"

2018-08-28 Thread Till Rohrmann
+1 for removing these slides. On Mon, Aug 27, 2018 at 10:03 AM Fabian Hueske wrote: > I agree to remove the slides section. > A lot of the content is out-dated and hence not only useless but might > sometimes even cause confusion. > > Best, > Fabian > > > > Am Mo., 27. Aug. 2018 um 08:29 Uhr

Re: anybody can start flink with job mode?

2018-08-28 Thread Till Rohrmann
Hi Hao, Vino is right, you need to specify the -j/--job-classname option, which specifies the class name of the job you want to execute. Please make sure that the jar containing this class is on the class path. I recently pushed some fixes which generate a better error message than the one you've received. If

Re: Implement Joins with Lookup Data

2018-08-28 Thread Till Rohrmann
; >>>>> BTW, >>>>> >>>>> We got around bootstrap problem for similar use case using a “nohup” >>>>> topic as input stream. Our CICD pipeline currently passes an initialize >>>>> option to app IF there is a need to boo

Re: JobGraphs not cleaned up in HA mode

2018-08-28 Thread Till Rohrmann
Hi Encho, thanks a lot for reporting this issue. The problem arises whenever the old leader maintains the connection to ZooKeeper. If this is the case, then ephemeral nodes which we create to protect against faulty delete operations are not removed and consequently the new leader is not able to

Re: Job Manager killed by Kubernetes during recovery

2018-08-22 Thread Till Rohrmann
y indicated as an event. > > Thanks Vino and Till for your time! > > Bruno > > On Tue, 21 Aug 2018 at 10:21 Till Rohrmann wrote: > >> Hi Bruno, >> >> in order to debug this problem we would need a bit more information. In >> particular, the logs of the

Re: Could not cancel job (with savepoint) "Ask timed out"

2018-08-22 Thread Till Rohrmann
t another cancel-with-savepoint call is > just ignored. > > On Tue, Aug 21, 2018 at 1:18 PM Till Rohrmann > wrote: > >> Just a small addition. Concurrent cancel call will interfere with the >> cancel-with-savepoint command and directly cancel the job. So it is better >

[ANNOUNCE] Weekly community update #34

2018-08-22 Thread Till Rohrmann
Dear community, this is the weekly community update thread #34. Please post any news and updates you want to share with the community to this thread. # Flink 1.5.3 has been released The community released Flink 1.5.3 [1]. It contains more than 20 fixes and improvements. The community recommends

Re: [ANNOUNCE] Apache Flink 1.5.3 released

2018-08-21 Thread Till Rohrmann
Great news. Thanks a lot for managing the release Chesnay and to all who have contributed to this release. Cheers, Till On Tue, Aug 21, 2018 at 2:12 PM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of > Apache Flink 1.5.3, which is the third bugfix

Re: Could not cancel job (with savepoint) "Ask timed out"

2018-08-21 Thread Till Rohrmann
Just a small addition: a concurrent cancel call will interfere with the cancel-with-savepoint command and directly cancel the job. So it is better to use the cancel-with-savepoint call in order to take a savepoint and then cancel the job automatically. Cheers, Till On Thu, Aug 9, 2018 at 9:53 AM

Re: 答复: Best way to find the current alive jobmanager with HA mode zookeeper

2018-08-21 Thread Till Rohrmann
would be more user friendly and would eliminate the > extra step of figuring out the Job Manager address. > > Thanks, > M > > > > On Tue, Jul 31, 2018 at 3:54 PM, Till Rohrmann > wrote: > >> I think that the web ui automatically redirects to the current leader. S

Re: Job Manager killed by Kubernetes during recovery

2018-08-21 Thread Till Rohrmann
Hi Bruno, in order to debug this problem we would need a bit more information. In particular, the logs of the cluster entrypoint and your K8s deployment specification would be helpful. If you have some memory limits specified these would also be interesting to know. Cheers, Till On Sun, Aug 19,

[ANNOUNCE] Weekly community update #33

2018-08-15 Thread Till Rohrmann
Dear community, this is the weekly community update thread #33. Please post any news and updates you want to share with the community to this thread. # Flink 1.6.0 has been released The community released Flink 1.6.0 [1]. Thanks to everyone who made this release possible. # Improving tutorials

Re: flink on kubernetes

2018-08-13 Thread Till Rohrmann
Hi Mingliang, I'm currently writing the updated documentation for Flink's job cluster container entrypoint. It should be ready later today. In the meantime, you can check out the `flink-container` module and its subdirectories. They already contain some information in the README.md on how to
