Re: Yarn run single job

2018-07-10 Thread Garrett Barton
ally type > that in. > > On 10.07.2018 17:02, Garrett Barton wrote: > > Greetings all, > The docs say that I can skip creating a cluster and let the jobs create > their own clusters on yarn. The example given is: > > ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCoun

Yarn run single job

2018-07-10 Thread Garrett Barton
Greetings all, The docs say that I can skip creating a cluster and let the jobs create their own clusters on yarn. The example given is: ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar What I cannot figure out is what the -m option is meant for. In my opinion there is no

Re: Flink 1.5 Yarn Connection unexpectedly closed

2018-06-22 Thread Garrett Barton
t; > On Thu, Jun 21, 2018 at 7:43 PM Garrett Barton > wrote: > >> Actually, random thought, could yarn preemption be causing this? What is >> the failure scenario should a working task manager go down in yarn that is >> doing real work? The docs make it sound like it s

Re: Flink 1.5 Yarn Connection unexpectedly closed

2018-06-21 Thread Garrett Barton
, Jun 21, 2018 at 1:20 PM Garrett Barton wrote: > Thank you all for the reply! > > I am running batch jobs, I read in a handful of files from HDFS and output > to HBase, HDFS, and Kafka. I run into this when I have partial usage of > the cluster as the job runs. So right now I s

Re: Flink 1.5 Yarn Connection unexpectedly closed

2018-06-21 Thread Garrett Barton
>> >> I'm adding Till to this thread who's very familiar with scheduling and >> process communication. >> >> Best, Fabian >> >> 2018-06-19 0:03 GMT+02:00 Garrett Barton : >> >>> Hey all, >>> >>> My jobs that I am trying to write

Flink 1.5 Yarn Connection unexpectedly closed

2018-06-18 Thread Garrett Barton
Hey all, My jobs that I am trying to write in Flink 1.5 are failing after a few minutes. I think its because the idle task managers are shutting down, which seems to kill the client and the running job. The running job itself was still going on one of the other task managers. I get:

Re: Flink Batch Performance degradation at scale

2017-12-07 Thread Garrett Barton
17-12-07 18:35 GMT+01:00 Garrett Barton <garrett.bar...@gmail.com>: > >> Stacktrace generates every time with the following settings (tried >> different memory fractions): >> yarn-session.sh -n 400 -s 2 -tm 9200 -jm 5120 >> akka.ask.timeout: 60s >> containeriz

Re: Flink Batch Performance degradation at scale

2017-12-07 Thread Garrett Barton
> > Best, Fabian > > [1] https://cwiki.apache.org/confluence/display/FLINK/Data+ > exchange+between+tasks > > 2017-12-07 16:30 GMT+01:00 Garrett Barton <garrett.bar...@gmail.com>: > >> Thanks for the reply again, >> >> I'm currently doing runs with:

Re: Flink Batch Performance degradation at scale

2017-12-07 Thread Garrett Barton
e as part of the GroupReduce. > > Best, Fabian > > [1] https://ci.apache.org/projects/flink/flink-docs- > release-1.3/setup/config.html#yarn > > 2017-12-06 23:32 GMT+01:00 Garrett Barton <garrett.bar...@gmail.com>: > >> Wow thank you for the reply, you gave me a lot to

Re: Flink Batch Performance degradation at scale

2017-12-06 Thread Garrett Barton
ate. > > Best, Fabian > > [1] https://ci.apache.org/projects/flink/flink-docs- > release-1.3/setup/config.html#managed-memory > [2] https://ci.apache.org/projects/flink/flink-docs- > release-1.3/dev/execution_configuration.html > [3] http://flink.apache.org/visualizer/ > >

Re: Flink Batch Performance degradation at scale

2017-12-06 Thread Garrett Barton
unning. > The plan can be obtained from the ExecutionEnvironment by calling > env.getExecutionPlan() instead of env.execute(). > > I would also like to know how you track the progress of the program. > Are you looking at the record counts displayed in the WebUI? > > Best, > Fabian

Flink Batch Performance degradation at scale

2017-12-05 Thread Garrett Barton
I have been moving some old MR and hive workflows into Flink because I'm enjoying the api's and the ease of development is wonderful. Things have largely worked great until I tried to really scale some of the jobs recently. I have for example one etl job that reads in about 12B records at a time

Re: Classpath/ClassLoader issues

2017-10-06 Thread Garrett Barton
Fabian, Just to follow up on this, I took the patch, compiled that class and stuck it into the existing 1.3.2 jar and all is well. (I couldn't get all of flink to build correctly) Thank you! On Wed, Sep 20, 2017 at 3:53 PM, Garrett Barton <garrett.bar...@gmail.com> wrote: > Fabian, &

Re: At end of complex parallel flow, how to force end step with parallel=1?

2017-10-05 Thread Garrett Barton
ssue? > > Thanks, Fabian > > 2017-10-03 21:57 GMT+02:00 Garrett Barton <garrett.bar...@gmail.com>: > >> Gábor >> ​, >> Thank you for the reply, I gave that a go and the flow still showed >> parallel 90 for each step. Is the ui not 100% accurate perhaps

Re: At end of complex parallel flow, how to force end step with parallel=1?

2017-10-03 Thread Garrett Barton
Gévay <gga...@gmail.com> wrote: > Hi Garrett, > > You can call .setParallelism(1) on just this operator: > > ds.reduceGroup(new GroupReduceFunction...).setParallelism(1) > > Best, > Gabor > > > > On Mon, Oct 2, 2017 at 3:46 PM, Garrett Barton <

At end of complex parallel flow, how to force end step with parallel=1?

2017-10-02 Thread Garrett Barton
I have a complex alg implemented using the DataSet api and by default it runs with parallel 90 for good performance. At the end I want to perform a clustering of the resulting data and to do that correctly I need to pass all the data through a single thread/process. I read in the docs that as

Re: Classpath/ClassLoader issues

2017-09-20 Thread Garrett Barton
is actually OK, because we usually set the context >> classloader to be the user classloader before calling user code. However, >> this has not been done here. >> So, this is in fact a bug. >> >> I created this JIRA issue: https://issues.apache.org/jira >> /browse/FLINK

Re: Classpath/ClassLoader issues

2017-09-19 Thread Garrett Barton
problem than the input format. > Can you post more of the stacktrace? This would help to identify the spot > in the Flink code where the exception is thrown. > > Thanks, Fabian > > 2017-09-18 18:42 GMT+02:00 Garrett Barton <garrett.bar...@gmail.com>: > >> Hey all, >> &