RE: Access the data in a stream after writing to a sink

2018-07-10 Thread Teena Kappen // BPRISE
Yes, a custom sink with the required checks seems to be the only option. From: Hequn Cheng Sent: 10 July 2018 18:23 To: Teena Kappen // BPRISE Cc: user@flink.apache.org Subject: Re: Access the data in a stream after writing to a sink Hi Teena, It seems that a sink cannot output data into

Re: How to find the relation between a running job and the original jar?

2018-07-10 Thread Lasse Nedergaard
Hi Tang, thanks for the link. Yes, you are right and it works fine. But when I use the REST API for getting running jobs I can't find any reference back to the jar used to start the job. Best regards, Lasse Nedergaard > On 11 Jul 2018 at 05:22, Tang Cloud wrote: > >

Re: How to find the relation between a running job and the original jar?

2018-07-10 Thread Tang Cloud
Hi Lasse, as far as I know, if you use the POST /jars/upload REST API to submit your job, you can then GET /jars to list the user jar you just uploaded. More information is in https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/rest_api.html#dispatcher (Apache Flink 1.5).
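
For anyone looking for the concrete calls, this is roughly what they look like with curl (host, port, and jar name are placeholders, not values from this thread):

```sh
# Upload a user jar to the cluster (assumes the default REST port 8081)
curl -X POST -H "Expect:" -F "jarfile=@./my-job.jar" http://localhost:8081/jars/upload

# List the uploaded jars, including the generated jar ids
curl http://localhost:8081/jars
```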

Re: Description of Flink event time processing

2018-07-10 Thread Elias Levy
Thanks for all the comments. I've updated the document to account for the feedback. Please take a look. On Fri, Jul 6, 2018 at 2:33 PM Elias Levy wrote: > Apologies. Comments are now enabled. > > On Thu, Jul 5, 2018 at 6:09 PM Rong Rong wrote: > >> Hi Elias, >> >> Thanks for putting

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
To add one thing to the Mesos question: my assumption that constraints on the JobManager can work is based on this sentence from the link below: “When running Flink with Marathon, the whole Flink cluster including the job manager will be run as Mesos tasks in the Mesos cluster.”

Re: Is Flink using an even-odd versioning system

2018-07-10 Thread Bowen Li
Hi Alexander, AFAIK, Flink releases don't do that. The community has done its best to ensure every release is at its best state. Thanks, Bowen On Tue, Jul 10, 2018 at 4:54 AM Alexander Smirnov < alexander.smirn...@gmail.com> wrote: > to denote development and stable releases? >

Re: Flink dynamic scaling 1.5

2018-07-10 Thread Anil
Thanks for the reply, Till. Resubmitting the job is an option. I was wondering if there is any way Flink could be configured to detect issues such as memory pressure and rescale without me submitting the job again.

How to find the relation between a running job and the original jar?

2018-07-10 Thread Lasse Nedergaard
Hi. I am working on a solution where I want to check if a running job uses the right jar in the right version. Does anyone know if it is possible, through the REST API, to find information about a running job that contains the jar id or something similar, so it is possible to look up the original jar?

RE: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Sofer, Tovi
Hi Till, group, thank you for your response. After reading further online about Mesos – can't Mesos fulfil the requirement of running the job manager on the primary server? By using: "constraints": [["datacenter", "CLUSTER", "main"]] (See

Re: How to trigger a function on the state periodically?

2018-07-10 Thread anna stax
sure. I will go ahead with this for now. Thanks for your suggestions. On Mon, Jul 9, 2018 at 11:10 PM, Hequn Cheng wrote: > Hi, > It depends on how many different users. In most cases, the performance > will be fine. I think it worth to give a try. :-) > Of course, there are ways to reduce the

Re: Yarn run single job

2018-07-10 Thread Garrett Barton
AHH, it works! It never occurred to me that it meant literally typing in yarn-cluster. Thank you! On Tue, Jul 10, 2018 at 11:17 AM Chesnay Schepler wrote: > -m yarn-cluster switches the client into yarn mode. > > yarn-cluster is not a placeholder or anything, you have to literally type > that in. >

Re: Confusions About JDBCOutputFormat

2018-07-10 Thread wangsan
Hi Hequn, establishing a connection for each batch write may also have the idle-connection problem, since we are not sure when the connection will be closed. We call the flush() method when a batch is finished or when we snapshot state, but what if checkpointing is not enabled and the batch size is not reached

Re: Filter columns of a csv file with Flink

2018-07-10 Thread françois lacombe
Hi Hequn, 2018-07-10 3:47 GMT+02:00 Hequn Cheng: > Maybe I misunderstand you. So you don't want to skip the whole file? > Yes, I do. By skipping the whole file I mean "throw an Exception to stop the process and inform the user that the file is invalid for a given reason" and not "the process goes fully

Re: Yarn run single job

2018-07-10 Thread Chesnay Schepler
-m yarn-cluster switches the client into yarn mode. yarn-cluster is not a placeholder or anything, you have to literally type that in. On 10.07.2018 17:02, Garrett Barton wrote: Greetings all, The docs say that I can skip creating a cluster and let the jobs create their own clusters on

Yarn run single job

2018-07-10 Thread Garrett Barton
Greetings all, The docs say that I can skip creating a cluster and let the jobs create their own clusters on yarn. The example given is: ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar What I cannot figure out is what the -m option is meant for. In my opinion there is no

Re: REST API time out ( flink 1.5 ) on SP

2018-07-10 Thread Vishal Santoshi
As we are on the topic of configuration, let me take the liberty to ask this: does akka.jvm-exit-on-fatal-error: true have any relevance vis-à-vis quarantining (it seems that we have our own gossip protocol), and if not, in what other places is it used that make it relevant? On Tue, Jul 10, 2018 at 10:19 AM,

Re: REST API time out ( flink 1.5 ) on SP

2018-07-10 Thread Vishal Santoshi
That should do. Thanks much. On Tue, Jul 10, 2018 at 7:52 AM, Till Rohrmann wrote: > In order to configure the timeouts for the REST handlers, please use > `web.timeout`. For the client timeout use `akka.client.timeout`. > > Cheers, > Till > > On Tue, Jul 10, 2018 at 10:54 AM Vishal Santoshi <

Re: 1.5 some thing weird

2018-07-10 Thread Vishal Santoshi
Will try the setting out. I do not want to push it, but the exception could be much more descriptive :) Thanks much. On Tue, Jul 10, 2018 at 7:48 AM, Till Rohrmann wrote: > Whether a Flink task should fail in case of a checkpoint error or not can > be configured via the CheckpointConfig which you

Re: Access the data in a stream after writing to a sink

2018-07-10 Thread Hequn Cheng
Hi Teena, it seems that a sink cannot output data into another sink. Maybe we can implement a combined user-defined sink. In the combined sink, only write to the next sink if the first write is successful. On Tue, Jul 10, 2018 at 3:23 PM, Teena Kappen // BPRISE < teena.kap...@bprise.com> wrote:
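
A minimal sketch of such a combined sink (not Flink's actual Cassandra sink): both Writer fields are hypothetical placeholders for whatever synchronous client code performs the two writes.

```java
import java.io.Serializable;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Sketch of a "combined" sink: the record is forwarded to the second target
// only if the first (Cassandra) write did not throw.
public class CombinedCassandraThenNextSink extends RichSinkFunction<String> {

    /** Placeholder for a synchronous write client (e.g. a thin Cassandra driver wrapper). */
    public interface Writer extends Serializable {
        void write(String record) throws Exception;
        void close() throws Exception;
    }

    private final Writer cassandraWriter;
    private final Writer nextSinkWriter;

    public CombinedCassandraThenNextSink(Writer cassandraWriter, Writer nextSinkWriter) {
        this.cassandraWriter = cassandraWriter;
        this.nextSinkWriter = nextSinkWriter;
    }

    @Override
    public void invoke(String record, Context context) throws Exception {
        // If this throws, the record never reaches the second sink
        // (and the job fails or retries, depending on the restart strategy).
        cassandraWriter.write(record);

        // Only reached when the first write succeeded.
        nextSinkWriter.write(record);
    }

    @Override
    public void close() throws Exception {
        cassandraWriter.close();
        nextSinkWriter.close();
    }
}
```

How a failed first write is handled (fail the job, retry, or drop the record) is left to whatever the surrounding job needs.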

Re: Confusions About JDBCOutputFormat

2018-07-10 Thread Hequn Cheng
Hi wangsan, I agree with you. It would be kind of you to open a JIRA to track the problem. For the first problem, I think we need to establish a connection each time we execute a batch write. And it is better to get the connection from a connection pool. For the second problem, to avoid multithread
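
A rough sketch of the per-batch-connection idea from the first point, assuming plain JDBC via DriverManager (a pooled DataSource would replace the getConnection call); this is not the actual JDBCOutputFormat code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;

// Sketch only: buffer rows and open a fresh connection for every flush,
// so no connection is held open (and possibly dropped) between batches.
public class PerBatchJdbcWriter {

    private final List<Object[]> buffer = new ArrayList<>();
    private final String url;       // e.g. "jdbc:mysql://..." (placeholder)
    private final String insertSql; // e.g. "INSERT INTO t VALUES (?, ?)" (placeholder)
    private final int batchSize;

    public PerBatchJdbcWriter(String url, String insertSql, int batchSize) {
        this.url = url;
        this.insertSql = insertSql;
        this.batchSize = batchSize;
    }

    public void add(Object[] row) throws Exception {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    public void flush() throws Exception {
        if (buffer.isEmpty()) {
            return;
        }
        // A pooled DataSource#getConnection() would go here instead.
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement stmt = conn.prepareStatement(insertSql)) {
            for (Object[] row : buffer) {
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]);
                }
                stmt.addBatch();
            }
            stmt.executeBatch();
        }
        buffer.clear();
    }
}
```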

Is Flink using an even-odd versioning system

2018-07-10 Thread Alexander Smirnov
to denote development and stable releases?

Re: REST API time out ( flink 1.5 ) on SP

2018-07-10 Thread Till Rohrmann
In order to configure the timeouts for the REST handlers, please use `web.timeout`. For the client timeout use `akka.client.timeout`. Cheers, Till On Tue, Jul 10, 2018 at 10:54 AM Vishal Santoshi wrote: > Aah sorry, while taking a save point without cancel, we hit this timeout > ( appears to
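
Both keys go into flink-conf.yaml; the values below are only examples, not recommendations from this thread (web.timeout is in milliseconds, akka.client.timeout takes a duration string).

```yaml
# Timeout for the REST handlers, in milliseconds (example value)
web.timeout: 60000

# Client-side ask timeout, used e.g. when triggering savepoints (example value)
akka.client.timeout: 120 s
```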

Re: 1.5 some thing weird

2018-07-10 Thread Till Rohrmann
Whether a Flink task should fail in case of a checkpoint error or not can be configured via the CheckpointConfig which you can access via the StreamExecutionEnvironment. You have to call `CheckpointConfig#setFailOnCheckpointingErrors(false)` to deactivate the default behaviour where the task
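
In code this looks roughly like the following; the checkpoint interval is an illustrative value, not one from this thread.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointFailureConfigExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // example interval

        // Do not fail the task (and hence restart the job) when a checkpoint cannot be completed.
        env.getCheckpointConfig().setFailOnCheckpointingErrors(false);

        // ... build the rest of the pipeline and call env.execute(...)
    }
}
```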

State sharing across trigger and evictor

2018-07-10 Thread Jayant Ameta
Hi, I'm using the GlobalWindow with a custom CountTrigger (similar to the CountTrigger provided by Flink). I'm also using an evictor to remove some of the elements from the window. Is it possible to update the count when an element is evicted? For example, can I access the ReducingState used by

Support for detached mode for the Flink 1.5 SQL Client

2018-07-10 Thread Shivam Sharma
Hi all, is there any way to run Flink 1.5 sql-client queries in detached mode? We need to run multiple queries for different use cases, and the sql-client shell will be opened by the user on demand. -- Shivam Sharma Data Engineer @ Goibibo Indian Institute Of Information Technology, Design and

Want to write a Kafka Sink for the Flink 1.5 SQL Client

2018-07-10 Thread Shivam Sharma
Hi all, we want to write Kafka sink functionality for the Flink (1.5) SQL Client. We have read the code and chalked out a rough plan for the implementation. Any guidance for this implementation would be very helpful. Thanks -- Shivam Sharma Data Engineer @ Goibibo Indian Institute Of Information

Re: REST API time out ( flink 1.5 ) on SP

2018-07-10 Thread Vishal Santoshi
Aah sorry, while taking a savepoint without cancel, we hit this timeout (it appears to be 10 seconds). The savepoint does succeed; in this case it takes roughly 13-15 seconds. I wanted to know which configuration to change to increase the timeout on the REST call. It does not seem to be

Re: 1.5 some thing weird

2018-07-10 Thread Vishal Santoshi
That makes sense; what does not make sense is that the pipeline restarted. I would have imagined that an aborted checkpoint would not abort the pipeline. On Tue, Jul 10, 2018 at 3:16 AM, Till Rohrmann wrote: > Hi Vishal, > > it looks as if the flushing of the checkpoint data to HDFS failed due

RE: Access the data in a stream after writing to a sink

2018-07-10 Thread Teena Kappen // BPRISE
Adding to the previous question, is it possible to check whether each record in a stream was written to a Cassandra sink without any exceptions? I have to write the records to the next sink only if the first write is successful. So, replicating the streams before the write is not an option.

Access the data in a stream after writing to a sink

2018-07-10 Thread Teena Kappen // BPRISE
Hi, is it possible to access the data in a stream that was written to a sink? I have a Cassandra sink in my stream job, and I have to access all the records that were written to the Cassandra sink and write them to another sink. Is there any way to do that? Regards, Teena

Re: REST API time out ( flink 1.5 ) on SP

2018-07-10 Thread Till Rohrmann
Hi Vishal, you need to give us a little bit more context in order to understand your question. Cheers, Till On Mon, Jul 9, 2018 at 10:36 PM Vishal Santoshi wrote: > java.util.concurrent.CompletionException: > akka.pattern.AskTimeoutException: Ask timed out on >

Re: 1.5 some thing weird

2018-07-10 Thread Till Rohrmann
Hi Vishal, it looks as if the flushing of the checkpoint data to HDFS failed due to some expired lease on the checkpoint file. Therefore, Flink aborted the checkpoint `chk-125` and removed it. This is the normal behaviour if Flink cannot complete a checkpoint. As you can see, afterwards, the

Re: Checkpointing in Flink 1.5.0

2018-07-10 Thread Sampath Bhat
Chesnay - why is the absolute file check required in RocksDBStateBackend.setDbStoragePaths(String... paths)? I think this is causing the issue. It is not related to GlusterFS or the file system. The same problem can be reproduced with the following configuration on a local machine. The flink
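
For context, this is roughly how the method in question is used; the checkpoint URI and local path are placeholders, not the configuration from this thread.

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbStoragePathExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder checkpoint URI; in the thread this would point at the shared file system.
        RocksDBStateBackend backend = new RocksDBStateBackend("file:///tmp/flink-checkpoints");

        // setDbStoragePaths is where the absolute-path check referred to above lives;
        // the local directory here is just a placeholder.
        backend.setDbStoragePaths("/tmp/rocksdb-local-dir");

        env.setStateBackend(backend);
        // ... rest of the job
    }
}
```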

Re: high availability with automated disaster recovery using zookeeper

2018-07-10 Thread Till Rohrmann
Hi Tovi, that is an interesting use case you are describing here. I think, however, it depends mainly on the capabilities of ZooKeeper to produce the intended behavior. Flink itself relies on ZooKeeper for leader election in HA mode but does not expose any means to influence the leader election

Confusions About JDBCOutputFormat

2018-07-10 Thread wangsan
Hi all, I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink application, but I am confused by the implementation of JDBCOutputFormat. 1. The Connection is established when the JDBCOutputFormat is opened and is then used the whole time. But if this connection lies idle for a long

Re: How to trigger a function on the state periodically?

2018-07-10 Thread Hequn Cheng
Hi, it depends on how many different users there are. In most cases, the performance will be fine. I think it is worth a try. :-) Of course, there are ways to reduce the number of timers, for example keyBy(userId % 1024) and using a MapState to store the different users of the same group. On Tue, Jul 10,
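
A rough sketch of that grouping idea, with all names and the timer interval being illustrative: key by userId % 1024, keep the users of each group in a MapState, and register one (rounded) timer per group instead of one per user.

```java
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Illustrative event type; only the userId field matters for this sketch.
class UserEvent {
    public long userId;
    public long lastSeen;
}

// Applied on a stream keyed by userId % 1024, so at most 1024 groups (and timers) exist.
public class GroupedTimerFunction extends ProcessFunction<UserEvent, String> {

    private static final long ONE_HOUR = 60 * 60 * 1000L; // example interval

    private transient MapState<Long, Long> usersInGroup; // userId -> last-seen timestamp

    @Override
    public void open(Configuration parameters) {
        usersInGroup = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("users-in-group", Long.class, Long.class));
    }

    @Override
    public void processElement(UserEvent event, Context ctx, Collector<String> out) throws Exception {
        usersInGroup.put(event.userId, event.lastSeen);

        // Round to the next full hour so repeated registrations collapse into one timer per group.
        long nextTimer = (ctx.timerService().currentProcessingTime() / ONE_HOUR + 1) * ONE_HOUR;
        ctx.timerService().registerProcessingTimeTimer(nextTimer);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // One timer firing covers every user stored for this group.
        for (java.util.Map.Entry<Long, Long> user : usersInGroup.entries()) {
            out.collect("checking user " + user.getKey() + ", last seen " + user.getValue());
        }
    }
}
```

Usage would be along the lines of `stream.keyBy(e -> e.userId % 1024).process(new GroupedTimerFunction())`, where `stream` is a DataStream<UserEvent> obtained elsewhere.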