Re: Flink Task Allocation on Yarn

2018-11-06 Thread vino yang
Hi Marvin, Can you tell me why you want to do this? Thanks, vino. Marvin777 wrote on Tue, Oct 30, 2018 at 10:24 AM: > Hi vino, > > Our current production environment is on YARN. We do not want to increase > the operation and maintenance cost of the Standalone mode. > > Is there any other way to make better

Re: Live configuration change

2018-11-06 Thread Ning Shi
> On Nov 6, 2018, at 4:22 PM, Elias Levy wrote: > > Also note that there is a pending PR to allow the Cassandra sink to back > pressure, so that the cluster does not get overwhelmed. Yes, I’ve been following the development on that pull request. Unfortunately, we have to go live very soon

Re: AvroInputFormat Serialisation Issue

2018-11-06 Thread Vinay Patil
Hi, I am facing a similar issue today with Flink 1.6.0 - AvroOutputFormat: AvroOutputFormat<GenericRecord> tuple2AvroOutputFormat = new AvroOutputFormat<>(new Path(""), GenericRecord.class); testDataSet .map(new GenerateGenericRecord())

Per job cluster doesn't shut down after the job is canceled

2018-11-06 Thread Paul Lam
Hi, I’m using Flink 1.5.3, and I’ve seen several times that the detached YARN cluster doesn’t shut down after the job is canceled successfully. The only errors I found in jobmanager’s log are as below (the second one appears multiple times): ``` 2018-11-07 09:48:38,663 WARN

Re: Always trigger calculation of a tumble window in Flink SQL

2018-11-06 Thread yinhua.dai
Hi Piotr, Can you elaborate more on the solution with the custom operator? I don't think there will be any records from the SQL query if no input data is coming in within the time window, even if we convert the result to a datastream.

Re: Live configuration change

2018-11-06 Thread Elias Levy
Also note that there is a pending PR to allow the Cassandra sink to back pressure, so that the cluster does not get overwhelmed. On Tue, Nov 6, 2018 at 12:46 PM Ning Shi wrote: > > for rate limiting, would quota at Kafka brokers help? > > Thanks, Steven. This looks very promising. I'll try it

Re: Live configuration change

2018-11-06 Thread Ning Shi
> for rate limiting, would quota at Kafka brokers help? Thanks, Steven. This looks very promising. I'll try it out. -- Ning

Re: Live configuration change

2018-11-06 Thread Steven Wu
for rate limiting, would quota at Kafka brokers help? On Tue, Nov 6, 2018 at 10:29 AM Ning Shi wrote: > On Tue, Nov 06, 2018 at 07:44:50PM +0200, Nicos Maris wrote: > > Ning can you provide another example except for rate limiting? > > Our main use case and concern is rate limiting because

Re: Live configuration change

2018-11-06 Thread Ning Shi
On Tue, Nov 06, 2018 at 07:44:50PM +0200, Nicos Maris wrote: > Ning can you provide another example except for rate limiting? Our main use case and concern is rate limiting, because without it we could potentially overwhelm downstream systems (Cassandra) when the job plays catch up or replay
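Since the Kafka source itself offers no dynamic rate-limit setting, one workaround in this situation is to throttle inside an operator placed ahead of the sink. A minimal sketch of a token-bucket limiter in plain Java (no Flink dependency; the class and its names are hypothetical, not Flink API):

```java
// Hypothetical token-bucket rate limiter that could be called from inside a
// RichMapFunction placed before the Cassandra sink. Permits refill continuously
// at `permitsPerSecond`, up to a burst of `maxBurst`.
public class TokenBucket {
    private final double permitsPerSecond;
    private final double maxBurst;
    private double availablePermits;
    private long lastRefillNanos;

    public TokenBucket(double permitsPerSecond, double maxBurst) {
        this.permitsPerSecond = permitsPerSecond;
        this.maxBurst = maxBurst;
        this.availablePermits = maxBurst;
        this.lastRefillNanos = System.nanoTime();
    }

    /** Non-blocking: takes a permit if one is available. */
    public synchronized boolean tryAcquire() {
        refill();
        if (availablePermits >= 1.0) {
            availablePermits -= 1.0;
            return true;
        }
        return false;
    }

    /** Blocking: waits until a permit is available. */
    public synchronized void acquire() throws InterruptedException {
        while (!tryAcquire()) {
            Thread.sleep(1);
        }
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        availablePermits = Math.min(maxBurst, availablePermits + elapsedSeconds * permitsPerSecond);
        lastRefillNanos = now;
    }
}
```

Calling `acquire()` at the top of `map()` would cause back pressure to propagate upstream to the source during catch-up; the rate itself could then be adjusted by an external mechanism (e.g. rereading a config value periodically), which sidesteps the need to reconfigure the source.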

Flink Web UI does not show specific exception messages when job submission fails

2018-11-06 Thread Luis Gustavo Oliveira Silva
Hello, I was using Flink 1.4.2 and when submitting jobs through the Web UI, I could see exceptions that would help me debug jobs, such as: We're sorry, something went wrong. The server responded with: > > java.util.concurrent.CompletionException: > org.apache.flink.util.FlinkException: Could not

Re: Live configuration change

2018-11-06 Thread Nicos Maris
Ning can you provide another example except for rate limiting? On Tue, Nov 6, 2018, 6:20 PM Piotr Nowojski wrote: > Hi, > > Sorry but none that I’m aware of. As far as I know, the only way to > dynamically configure Kafka source would be for you to copy and modify its > code. > > Piotrek > > > On 6 Nov

Re: Understanding checkpoint behavior

2018-11-06 Thread PranjalChauhan
Thank you for your response Piotr. I plan to upgrade to Flink 1.5.x early next year. Two follow-up questions for now. 1. "When operator snapshots are taken, there are two parts: the synchronous and the asynchronous parts." I understand that when the operator snapshot is being taken, the

Re: Using FlinkKinesisConsumer through a proxy

2018-11-06 Thread Vijay Balakrishnan
Hi Gordon, This still didn't work :( Tried a few combinations with: kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyDomain", "..."); kinesisConsumerConfig.setProperty(AWSUtil.AWS_CLIENT_CONFIG_PREFIX + "proxyHost", "http://.com");
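A sketch of how such proxy properties might be assembled, assuming the connector forwards properties under AWSUtil.AWS_CLIENT_CONFIG_PREFIX to the AWS SDK's ClientConfiguration setters (the prefix string, host, and port below are assumptions for illustration, not verified values):

```java
import java.util.Properties;

public class KinesisProxyConfig {
    // Assumed value of AWSUtil.AWS_CLIENT_CONFIG_PREFIX; check the connector source.
    static final String AWS_CLIENT_CONFIG_PREFIX = "aws.clientconfig.";

    static Properties buildConsumerConfig() {
        Properties config = new Properties();
        // ClientConfiguration.setProxyHost in the AWS SDK for Java expects a bare
        // host name, not a URL with a scheme. Host and port here are hypothetical.
        config.setProperty(AWS_CLIENT_CONFIG_PREFIX + "proxyHost", "proxy.example.com");
        config.setProperty(AWS_CLIENT_CONFIG_PREFIX + "proxyPort", "3128");
        return config;
    }
}
```

If the failing attempt above passed a value starting with "http://" as the proxy host, that alone could explain the error, since the SDK's setProxyHost takes a plain host name.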

Re: Parallelize an incoming stream into 5 streams with the same data

2018-11-06 Thread Vijay Balakrishnan
Cool, thanks! Hequn. I will try that approach. Vijay On Thu, Nov 1, 2018 at 8:18 PM Hequn Cheng wrote: > Hi Vijay, > > > I was hoping to groupBy(key._1,key._2) etc and then do a tumblingWindow > operation on the KeyedStream and then perform group operation on the > resultant set to get total

Re: Always trigger calculation of a tumble window in Flink SQL

2018-11-06 Thread Piotr Nowojski
Hi, You might come up with some magical self join that could do the trick - join/window join the aggregation result with itself and then aggregate it again. I don’t know if that’s possible (probably you would need to write a custom aggregate function) and it would be inefficient. It will be

[ANNOUNCE] Weekly community update #45

2018-11-06 Thread Till Rohrmann
Dear community, this is the weekly community update thread #45. Please post any news and updates you want to share with the community to this thread. # First release candidate for Flink 1.7.0 The community has published the first release candidate for Flink 1.7.0 [0]. Please help the community

Kubernetes Job Cluster - Checkpointing with Parallelism > 1

2018-11-06 Thread Thad Truman
Hi all, We are trying to configure checkpointing (RocksDb) for flink job clusters in k8s. As described here we have a parallelism value that is used as the -Dparallelism.default arg in the job manager

Re: Understanding checkpoint behavior

2018-11-06 Thread Piotr Nowojski
Hi, Checkpoint duration sync, that’s only the time taken for the “synchronous” part of taking a snapshot of your operator. Your 11m time probably comes from the fact that before this snapshot, checkpoint barrier was stuck somewhere in your pipeline for that amount of time processing some

Re: Live configuration change

2018-11-06 Thread Piotr Nowojski
Hi, Sorry but none that I’m aware of. As far as I know, the only way to dynamically configure Kafka source would be for you to copy and modify its code. Piotrek > On 6 Nov 2018, at 15:19, Ning Shi wrote: > > In the job I'm implementing, there are a couple of configuration > variables that I

Run a Flink job: REST/ binary client

2018-11-06 Thread Flavio Pompermaier
Hi to all, I'm using Flink 1.3.2. If executing a job using bin/flink run everything goes well. If executing using REST service of job manager (/jars:jarid/run) the job writes to the sink but fails to return on env.execute() and all the code after it is not executed. Is this a known issue? Was it

Re: Flink CEP Watermark Exception

2018-11-06 Thread Austin Cawley-Edwards
Hi Dawid, Just back in the office. The platform we run on recently announced Flink 1.6.0 support, so we upgraded and haven't seen this problem arise again yet! We believe it could have been the `equals` method falsely matching different records in rare instances, though the upgrade to Flink 1.6.0

Live configuration change

2018-11-06 Thread Ning Shi
In the job I'm implementing, there are a couple of configuration variables that I want to change at runtime, such as rate limit at the Kafka source. I know it's possible to use a control stream and join it with the normal stream to configure things in certain operators, but this doesn't work for

Re: Report failed job submission

2018-11-06 Thread Flavio Pompermaier
Any idea about how to address this issue? On Tue, Oct 16, 2018 at 11:32 AM Flavio Pompermaier wrote: > Hi to all, > what is the correct way to report a failure from a job > submission back to the user in Flink? > If everything is OK the job run API returns the job id but what if there > are

Stopping a streaming app from its own code : behaviour change from 1.3 to 1.6

2018-11-06 Thread LINZ, Arnaud
Hello, In Flink 1.3, I was able to make a clean stop of a HA streaming application just by ending the source “run()” method (with an ending condition). I tried to update my code to Flink 1.6.2, but that is no longer working. Even if there are no sources and no item to process, the cluster

Re: Split one dataset into multiple

2018-11-06 Thread Fabian Hueske
You have to define a common type, like an n-ary Either type, and return that from your source / operator. The resulting DataSet can be consumed by multiple FlatmapFunctions, each extracting and forwarding one of the result types. Cheers, Fabian Am Di., 6. Nov. 2018 um 10:40 Uhr schrieb madan
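The n-ary Either type Fabian describes can be sketched in plain Java. This is a minimal 3-way union (class name, factory names, and the type parameters are hypothetical); in a Flink job the source would emit DataSet<Either3<...>>, and each downstream FlatMapFunction would match on `kind` and forward only its own payload type:

```java
// Minimal 3-way tagged union along the lines of Flink's binary Either.
public class Either3<A, B, C> {
    public enum Kind { FIRST, SECOND, THIRD }

    public final Kind kind;
    private final Object value;

    private Either3(Kind kind, Object value) {
        this.kind = kind;
        this.value = value;
    }

    public static <A, B, C> Either3<A, B, C> first(A a)  { return new Either3<>(Kind.FIRST, a); }
    public static <A, B, C> Either3<A, B, C> second(B b) { return new Either3<>(Kind.SECOND, b); }
    public static <A, B, C> Either3<A, B, C> third(C c)  { return new Either3<>(Kind.THIRD, c); }

    // Callers must check `kind` before extracting; casts are unchecked by design.
    @SuppressWarnings("unchecked")
    public A asFirst()  { return (A) value; }
    @SuppressWarnings("unchecked")
    public B asSecond() { return (B) value; }
    @SuppressWarnings("unchecked")
    public C asThird()  { return (C) value; }
}
```

A FlatMapFunction for, say, the Dept branch would then emit `e.asFirst()` only when `e.kind == Kind.FIRST` and drop everything else, so one pass over the common DataSet splits it into typed streams.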

Re: Split one dataset into multiple

2018-11-06 Thread madan
Hi Vino, Thank you for the suggestions. In my case I am using DataSet since the data is limited, and split/select is not available on the DataSet API. I suspect even hash partitioning might not work for me: after hash partitioning, I do not know which partition holds which entity's data (Dept, Emp in my