Re: Task manager isn’t initiating with defined values in Flink 1.11 version as part of EMR 6.1.0

2021-01-11 Thread DEEP NARAYAN Singh
Regards, -Deep. On Mon, Jan 4, 2021 at 8:12 PM DEEP NARAYAN Singh wrote: Thanks Till, for the detailed explanation. I tried it and it is working fine. Once again, thanks for your quick response. Regards, -Deep. On Mon, 4 Jan, 2021, 2:20 PM Till Ro…

Re: Task manager isn’t initiating with defined values in Flink 1.11 version as part of EMR 6.1.0

2021-01-04 Thread DEEP NARAYAN Singh
Since Flink 1.11.0 there is the option slotmanager.number-of-slots.max to set an upper limit on the number of slots a cluster is allowed to allocate [1]. [1] https://issues.apache.org/jira/browse/FLINK-16605 Cheers, Till. On Mon, Jan 4, 2021 at 8:33 AM DEEP NARAYAN Singh…
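The option Till mentions goes into flink-conf.yaml; a minimal sketch (the value 20 is an arbitrary example, not a recommendation):

```yaml
# Cap the total number of slots the cluster may allocate
# (available since Flink 1.11.0; see FLINK-16605).
slotmanager.number-of-slots.max: 20
```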

Task manager isn’t initiating with defined values in Flink 1.11 version as part of EMR 6.1.0

2021-01-03 Thread DEEP NARAYAN Singh
Hi Guys, I’m struggling to initiate the task manager with Flink 1.11.0 in AWS EMR, but not with older versions. Let me put the full context here. *When using Flink 1.9.1 and EMR 5.29.0:* To create a long-running session, we used the command below. *sudo flink-yarn-session -n -s -jm…
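For comparison, a sketch of how the session command differs between the two setups (all flag values below are placeholders, not recommendations): in Flink 1.11 the -n flag (number of TaskManagers) was removed, and TaskManagers are instead brought up on demand as slots are requested, which is why a cap such as slotmanager.number-of-slots.max becomes relevant.

```shell
# Flink 1.9.1 / EMR 5.29.0 (old style; -n pre-allocates TaskManagers):
sudo flink-yarn-session -n 4 -s 4 -jm 1024m -tm 4096m -d

# Flink 1.11.0 / EMR 6.1.0: -n no longer exists; TaskManagers come up on demand.
sudo flink-yarn-session -s 4 -jm 1024m -tm 4096m -d
```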

Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-07 Thread DEEP NARAYAN Singh
…elimiter(String.valueOf((char) 0));
DataStream lines = env.readFile(csvFormat, path, FileProcessingMode.PROCESS_ONCE, -1);
lines.map(value -> value).print();
env.execute();
Then you can convert the content of the csv files to…

Urgent help on S3 CSV file reader DataStream Job

2020-12-07 Thread DEEP NARAYAN Singh
Hi Guys, Below is my code snippet, which reads all csv files under the given folder row by row, but my requirement is to read one csv file at a time and convert it to JSON, which will look like: {"A":"1","B":"3","C":"4","D":9} Csv file data format: --- *field_id,data,*
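The per-file conversion itself can be sketched in plain Java, independent of the Flink source. The class name is hypothetical, and it assumes each file carries a header row followed by a data row; all values are emitted as JSON strings here, so keeping a field numeric (as D in the example above) would need a per-field schema:

```java
import java.util.StringJoiner;

public class CsvToJson {
    // Builds one JSON object from a CSV header line plus one data line.
    // All values are emitted as JSON strings; numeric fields would need
    // per-field typing information to stay unquoted.
    static String toJson(String headerLine, String dataLine) {
        String[] headers = headerLine.split(",");
        String[] values = dataLine.split(",");
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < headers.length && i < values.length; i++) {
            json.add("\"" + headers[i] + "\":\"" + values[i] + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        // Prints: {"A":"1","B":"3","C":"4","D":"9"}
        System.out.println(toJson("A,B,C,D", "1,3,4,9"));
    }
}
```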

Need input on creating s3 source DataStream to filter JSON files before passing to the actual reader

2020-11-23 Thread DEEP NARAYAN Singh
Hi Guys, As part of the Flink S3 source reader, can we filter on JSON file names containing date and time? I am able to read all the json files under a specific date, but my use case is to read the json files for selected hours and minutes based on a date range, as shown
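The image in the original message did not survive the archive, so the exact key layout is unknown. Assuming the keys embed date, hour, and minute as path segments (a hypothetical layout), the filtering logic can be sketched in plain Java; with Flink's readFile source, the same predicate could be supplied via a FilePathFilter so unwanted files are never opened:

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class HourRangeFilter {
    // Assumed key layout: .../yyyy-MM-dd/HH/mm/... (hypothetical; adapt the
    // pattern to the real S3 prefix structure).
    private static final Pattern TS =
        Pattern.compile("(\\d{4}-\\d{2}-\\d{2})/(\\d{2})/(\\d{2})/");

    // Keeps only keys whose date matches and whose hour lies in [fromHour, toHour].
    static List<String> filter(List<String> keys, String date, int fromHour, int toHour) {
        return keys.stream().filter(k -> {
            Matcher m = TS.matcher(k);
            if (!m.find() || !m.group(1).equals(date)) {
                return false;
            }
            int hour = Integer.parseInt(m.group(2));
            return hour >= fromHour && hour <= toHour;
        }).collect(Collectors.toList());
    }
}
```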

Re: Resource Optimization for Flink Job in AWS EMR Cluster

2020-11-05 Thread DEEP NARAYAN Singh
…Prasanna. On Thu 5 Nov, 2020, 12:40 Satyaa Dixit wrote: Hi Deep, Thanks for bringing this to the table. I'm also facing a similar kind of issue while deploying my Flink job w…

Re: Resource Optimization for Flink Job in AWS EMR Cluster

2020-11-04 Thread DEEP NARAYAN Singh
Hi Guys, Sorry to bother you again. Could someone help me clarify my doubt? Any help will be highly appreciated. Thanks, -Deep. On Wed, Nov 4, 2020 at 6:26 PM DEEP NARAYAN Singh wrote: Hi All, I am running a Flink streaming job in an EMR cluster with parallelism 21…

Resource Optimization for Flink Job in AWS EMR Cluster

2020-11-04 Thread DEEP NARAYAN Singh
Hi All, I am running a Flink streaming job in an EMR cluster with parallelism 21, processing 500 records per second, but CPU utilization is still only around 5-8 percent. Below is the long-running session command in the EMR cluster, which has 3 instances of type c5.2xlarge (8 vCore, 16 GB memory, AWS

Re: Is there any way to change operator level parallelism dynamically?

2020-10-29 Thread DEEP NARAYAN Singh
…m the savepoint. The community is working on a more automated way to solve this problem. However, the first versions will use the same approach of taking a savepoint, which means that there will be a slight downtime. Cheers, Till. On Wed, Oct 28, 2020 at 7:05 PM DEEP NARAYAN Sing…
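The savepoint-based rescale described above can be sketched with the standard Flink CLI; the bucket, savepoint path, job ID, jar name, and the new parallelism are all placeholders:

```shell
# 1. Cancel the job with a savepoint (Flink 1.9-era syntax; <jobId> is a placeholder).
flink cancel -s s3://my-bucket/savepoints/ <jobId>

# 2. Resubmit from the savepoint with the new parallelism.
flink run -s s3://my-bucket/savepoints/savepoint-abcd-0123 -p 42 my-job.jar
```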

Is there any way to change operator level parallelism dynamically?

2020-10-28 Thread DEEP NARAYAN Singh
Hi Team, I need some quick help here. How can I change operator-level parallelism at runtime for a Flink job running in an AWS EMR cluster, to avoid downtime and backpressure under varying incoming load? I am very new to Flink and currently using Flink 1.9. Is there