flink can't read hdfs namenode logical url

2017-10-19 Thread 邓俊华
hi,
I start yarn-session.sh on YARN, but it can't read the HDFS logical URL. It 
always connects to hdfs://master:8020, but the port should be 9000; my HDFS 
defaultFS is hdfs://master. I have configured YARN_CONF_DIR and 
HADOOP_CONF_DIR, but it didn't work. Is it a bug? I use 
flink-1.3.0-bin-hadoop27-scala_2.10.
2017-10-20 11:00:05,395 DEBUG org.apache.hadoop.ipc.Client  
- IPC Client (1035144464) connection to startdt/173.16.5.215:8020 
from admin: closed
2017-10-20 11:00:05,398 ERROR org.apache.flink.yarn.YarnApplicationMasterRunner 
- YARN Application Master initialization failed
java.net.ConnectException: Call From spark3/173.16.5.216 to master:8020 failed 
on connection exception: java.net.ConnectException: Connection refused; For 
more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
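For reference, a logical (portless) NameNode URL such as hdfs://master normally 
comes from the Hadoop client configuration, which Flink can only resolve if 
HADOOP_CONF_DIR really points at the directory containing core-site.xml and 
hdfs-site.xml. A minimal sketch of the relevant keys; the nameservice and host 
names below are illustrative, not taken from this setup:

<!-- core-site.xml: logical defaultFS, no port -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master</value>
</property>

<!-- hdfs-site.xml: resolve the logical name to concrete NameNodes -->
<property>
    <name>dfs.nameservices</name>
    <value>master</value>
</property>
<property>
    <name>dfs.ha.namenodes.master</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.master.nn1</name>
    <value>namenode1:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.master.nn2</name>
    <value>namenode2:9000</value>
</property>

If these files are not visible to the YARN session, the Hadoop client falls 
back to its default RPC port 8020, which matches the error above.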



Testing GlobalWindows

2017-10-19 Thread Philip Doctor
I have a GlobalWindow with a custom trigger (I leave windows open for a 
variable length of time depending on how much data I have vs the expected 
amount, so I’m manipulating triggerContext.registerProcessingTimeTimer()).

When I emit data into my data stream, the Flink execution environment appears 
to halt after the test data is exhausted but before my GlobalWindow is triggered.

I tried changing my trigger to wait zero seconds on window full, but that just 
appears to have made my test racy where sometimes the global window triggers 
and calls apply (so the test passes) and sometimes the environment appears to 
halt first.

Is there a way for me to leave the execution environment running for a few 
seconds after all of my data is emitted? Or is there a good way for me to test 
this? So far my only solution has been to use env.fromCollection() in Flink, 
and then pass a custom iterator class whose iterator.next() itself hangs in a 
Thread.sleep(10_000) before delivering the last value (the last value I 
insert becomes untested garbage). That gives the window a chance to trigger and 
I always get the correct results (huzzah), but it's super hacky.
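For the record, the slow-iterator trick looks roughly like this; a sketch, with 
the class name and element type made up for illustration:

import java.io.Serializable;
import java.util.Iterator;

// Sleeps before handing out the last element, so the processing-time
// trigger gets a chance to fire before the bounded source finishes.
public class SlowLastElementIterator implements Iterator<Long>, Serializable {
    private final long max;
    private long next = 0;

    public SlowLastElementIterator(long max) { this.max = max; }

    public boolean hasNext() { return next <= max; }

    public Long next() {
        if (next == max) {
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return next++;
    }

    public void remove() { throw new UnsupportedOperationException(); }
}

// usage: env.fromCollection(new SlowLastElementIterator(100), Long.class)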

Any advice here is greatly appreciated.

Thanks,
Phil



Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-19 Thread r. r.
Thanks, Piotr
but my app code is self-contained in a fat-jar built with maven-shade, so why 
would the classpath affect this?

by "shade commons-compress", do you mean a relocation like the sketch below? 
It doesn't have an effect either.
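A relocation along these lines, inside the maven-shade-plugin configuration 
(the shaded package prefix is an illustrative choice):

<relocations>
    <relocation>
        <pattern>org.apache.commons.compress</pattern>
        <shadedPattern>shaded.org.apache.commons.compress</shadedPattern>
    </relocation>
</relocations>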

as a last resort I may try to rebuild Flink to use 1.14, but I don't want to go 
there yet =/


Best regards






 > Original message 
 > From: Piotr Nowojski pi...@data-artisans.com
 > Subject: Re: java.lang.NoSuchMethodError and dependencies problem
 > To: "r. r." 
 > Sent on: 19.10.2017 20:04



 
> I'm not 100% sure, so treat my answer with a grain of salt.
> 
> I think when you start the cluster this way, dependencies (some? all?) are 
> being loaded to the class path before the user's application is loaded. At that 
> point, it doesn't matter whether you have excluded commons-compress 1.4.1 in 
> your application's pom.xml. I'm not sure if this is solvable in some way, or not.
> 
> Maybe as a workaround, you could shade commons-compress usages in your 
> pom.xml?
> 
> Piotr Nowojski
> 
> > On 19 Oct 2017, at 17:36, r. r.  wrote:
> > 
> > flink is started with bin/start-local.sh
> > 
> > there is no classpath variable in the environment; 
> > flink/lib/flink-dist_2.11-1.3.2.jar contains commons-compress, still it 
> > should be overridden by the dependencyManagement directive
> > 
> > here is the stacktrace:
> > 
> > The program finished with the following exception:
> > 
> > org.apache.flink.client.program.ProgramInvocationException: The program 
> > execution failed: Job execution failed.
> > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
> > at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
> > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
> > at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:73)
> > at com.foot.semantic.flink.PipelineJob.main(PipelineJob.java:73)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:497)
> > at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> > at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> > at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> > at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> > at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> > at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> > at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:422)
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> > at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> > execution failed.
> > at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:933)
> > at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
> > at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
> > at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> > at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> > at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> > at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> > at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> > at 
> > 

Re: problem scale Flink job on YARN

2017-10-19 Thread Lei Chen
Hi Aljoscha,

I'm using version 1.3.0 and changing job-wide parallelism.

Lei

On Thu, Oct 19, 2017 at 9:47 AM, Aljoscha Krettek 
wrote:

> Hi Lei,
>
> Which version of Flink would that be? I'm guessing >= 1.3.x? In Flink 1.1
> the hash of an operator was tied to the parallelism but starting with 1.2
> that shouldn't happen anymore.
>
> Also, are you changing the parallelism job-wide or are there operators
> with differing parallelism? For example, could there be a source with
> parallelism 1 and an operator that had parallelism 1 after that which now
> has a different parallelism?
>
> Best,
> Aljoscha
>
>
> On 16. Oct 2017, at 06:28, Lei Chen  wrote:
>
> Hi,
>
> We're trying to implement some module to help autoscale our pipeline which
> is built  with Flink on YARN. According to the document, the suggested
> procedure seems to be:
>
> 1. cancel job with savepoint
> 2. start new job with increased YARN TM number and parallelism.
>
> However, step 2 always gave error
>
> Caused by: java.lang.IllegalStateException: Failed to rollback to
> savepoint hdfs://10.106.238.14:/tmp/savepoint-767421-20907d234655. Cannot
> map savepoint state for operator 37dfe905df17858e07858039ce3d8ae1 to the
> new program, because the operator is not available in the new program. If
> you want to allow to skip this, you can set the --allowNonRestoredState
> option on the CLI.
> at org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:130)
> at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1140)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> The procedure worked fine if parallelism was not changed.
>
> Also want to mention that I didn't manually specify OperatorIDs in my job. The
> documentation does mention that manual OperatorID assignment is suggested; just
> curious, is that mandatory in my case to fix the problem I'm seeing, given
> that my program doesn't change at all, so the autogenerated operatorID
> should be unchanged after the parallelism increase?
>
> thanks,
> Lei
>
>
>
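For reference, the manual assignment the documentation recommends is a .uid() 
call per stateful operator, so that savepoint state is matched by a stable 
name instead of an auto-generated hash. A sketch; the job below is 
illustrative, not the actual pipeline:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UidExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Long> sums = env
                .generateSequence(0, 1000)
                .uid("sequence-source")   // stable ID for the source's state
                .keyBy(x -> x % 10)
                .reduce((a, b) -> a + b)
                .uid("sum-reducer");      // savepoint state is mapped by this ID

        sums.print();
        env.execute("uid example");
    }
}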


Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-19 Thread Piotr Nowojski
I’m not 100% sure, so treat my answer with a grain of salt.

I think when you start the cluster this way, dependencies (some? all?) are 
being loaded to the class path before the user's application is loaded. At that 
point, it doesn't matter whether you have excluded commons-compress 1.4.1 in 
your application's pom.xml. I'm not sure if this is solvable in some way, or not.

Maybe as a workaround, you could shade commons-compress usages in your pom.xml?

Piotr Nowojski

> On 19 Oct 2017, at 17:36, r. r.  wrote:
> 
> flink is started with bin/start-local.sh
> 
> there is no classpath variable in the environment; 
> flink/lib/flink-dist_2.11-1.3.2.jar contains commons-compress, still it 
> should be overridden by the dependencyManagement directive
> 
> here is the stacktrace:
> 
> The program finished with the following exception:
> 
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Job execution failed.
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
> at 
> org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
> at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:73)
> at com.foot.semantic.flink.PipelineJob.main(PipelineJob.java:73)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:933)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
> at 
> org.apache.tika.parser.pkg.ZipContainerDetector.detectArchiveFormat(ZipContainerDetector.java:114)
> at 
> org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:85)
> at 
> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:115)
> at 
> com.foot.semantic.gafe.flink.services.TikaService.parse(TikaService.java:83)
> at 
> com.foot.semantic.gafe.flink.services.TikaService.parse(TikaService.java:59)
> at com.foot.semantic.flink.uima.TikaReader.process(TikaReader.java:137)
> at 

Re: Watermark on connected stream

2017-10-19 Thread Piotr Nowojski
With aitozi we have a hat trick oO
 
> On 19 Oct 2017, at 17:08, Tzu-Li (Gordon) Tai  wrote:
> 
> Ah, sorry for the duplicate answer, I didn’t see Piotr’s reply. Slight delay 
> on the mail client.
> 
> 
> On 19 October 2017 at 11:05:01 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) wrote:
> 
>> Hi Kien,
>> 
>> The watermark of an operator with multiple inputs will be determined by the 
>> current minimum watermark across all inputs.
>> 
>> Cheers,
>> Gordon
>> 
>> 
>> On 19 October 2017 at 8:06:11 PM, Kien Truong (duckientru...@gmail.com) wrote:
>> 
>>> Hi, 
>>> 
>>> If I connect two streams with different watermarks, how is the watermark 
>>> of the resulting stream determined ? 
>>> 
>>> 
>>> Best regards, 
>>> 
>>> Kien



Re: problem scale Flink job on YARN

2017-10-19 Thread Aljoscha Krettek
Hi Lei,

Which version of Flink would that be? I'm guessing >= 1.3.x? In Flink 1.1 the 
hash of an operator was tied to the parallelism but starting with 1.2 that 
shouldn't happen anymore.

Also, are you changing the parallelism job-wide or are there operators with 
differing parallelism? For example, could there be a source with parallelism 1 
and an operator that had parallelism 1 after that which now has a different 
parallelism?

Best,
Aljoscha

> On 16. Oct 2017, at 06:28, Lei Chen  wrote:
> 
> Hi, 
> 
> We're trying to implement some module to help autoscale our pipeline which is 
> built  with Flink on YARN. According to the document, the suggested procedure 
> seems to be:
> 
> 1. cancel job with savepoint
> 2. start new job with increased YARN TM number and parallelism. 
> 
> However, step 2 always gave error 
> 
> Caused by: java.lang.IllegalStateException: Failed to rollback to savepoint 
> hdfs://10.106.238.14:/tmp/savepoint-767421-20907d234655. Cannot map savepoint 
> state for operator 37dfe905df17858e07858039ce3d8ae1 to the new program, 
> because the operator is not available in the new program. If you want to 
> allow to skip this, you can set the --allowNonRestoredState option on the CLI.
>   at 
> org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:130)
>   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1140)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 
> The procedure worked fine if parallelism was not changed. 
> 
> Also want to mention that I didn't manually specify OperatorIDs in my job. The 
> documentation does mention that manual OperatorID assignment is suggested; just 
> curious, is that mandatory in my case to fix the problem I'm seeing, given 
> that my program doesn't change at all, so the autogenerated operatorID should 
> be unchanged after the parallelism increase?
> 
> thanks,
> Lei



Re: Watermark on connected stream

2017-10-19 Thread aitozi
Hi, 

You can see this field in AbstractStreamOperator:

// We keep track of watermarks from both inputs, the combined input is the minimum
// Once the minimum advances we emit a new watermark for downstream operators
private long combinedWatermark = Long.MIN_VALUE;

It will choose the minimum watermark across the connected streams.
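The corresponding two-input handling is essentially a running minimum. A 
simplified sketch of the idea, not the verbatim Flink source:

public void processWatermark1(Watermark mark) throws Exception {
    input1Watermark = mark.getTimestamp();
    long newMin = Math.min(input1Watermark, input2Watermark);
    if (newMin > combinedWatermark) {   // the combined watermark only advances
        combinedWatermark = newMin;
        processWatermark(new Watermark(combinedWatermark));
    }
}
// processWatermark2 is symmetric for the second input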





Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-19 Thread r. r.
flink is started with bin/start-local.sh

there is no classpath variable in the environment; 
flink/lib/flink-dist_2.11-1.3.2.jar contains commons-compress, still it should 
be overridden by the dependencyManagement directive

here is the stacktrace:

The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Job execution failed.
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
    at 
org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
    at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:73)
    at com.foot.semantic.flink.PipelineJob.main(PipelineJob.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
    at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
    at 
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
    at 
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
    at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:933)
    at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
    at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:876)
    at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
    at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NoSuchMethodError: 
org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
    at 
org.apache.tika.parser.pkg.ZipContainerDetector.detectArchiveFormat(ZipContainerDetector.java:114)
    at 
org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:85)
    at 
org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:115)
    at 
com.foot.semantic.gafe.flink.services.TikaService.parse(TikaService.java:83)
    at 
com.foot.semantic.gafe.flink.services.TikaService.parse(TikaService.java:59)
    at com.foot.semantic.flink.uima.TikaReader.process(TikaReader.java:137)
    at com.foot.semantic.flink.uima.TikaReader.getNext(TikaReader.java:82)
    at 
org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:175)
    at com.foot.semantic.flink.PipelineJob$1.map(PipelineJob.java:63)
    at com.foot.semantic.flink.PipelineJob$1.map(PipelineJob.java:37)
    at 
org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
    at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:528)
    at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:503)
    at 

Re: Flink CEP State Change Pattern

2017-10-19 Thread Tzu-Li (Gordon) Tai
Hi Philip!

I’m looping in Kostas to this thread. He might be able to provide some insights 
for your question.

Cheers,
Gordon

On 14 October 2017 at 8:54:45 PM, Philip Limbeck (philiplimb...@gmail.com) 
wrote:

Hi!  

I am quite new to Flink CEP and try to define a state change pattern  
with it. This means that only discrete changes in the event stream  
should be detected i.e.  

a a b b - triggers a single change from a to b  

Considering b the "bad" state, I would like to additionally recognize  
the state change from null (i.e. non-existing state) to b:  

b b b - triggers a single change from null to b  

Initially, I tried to model this behavior as follows:  

Pattern<...>begin("first")  
.where()  
.optional()  
.next("second")  
.where()  

However, since state b can be detected without state a, having a state  
change from a to b results in two identified patterns:  
a b  
and  
b  

Additionally, when the "bad" state b is already given, every  
subsequent b event will detect a new b pattern which is also not what  
I want.  

When the "optional" keyword is omitted, obviously no initial b events  
are detected.  

I know that Flink 1.4.0 will support AFTER_MATCH_SKIP, which I assume  
would aid in this situation as a single b event will not take part in  
two computation states.  

Being currently stuck with 1.3.2 is there a workaround using Flink CEP  
to enable this behavior?  

I am aware of the fact that this behavior is much easier to build  
using plain Flink.  
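For comparison, a sketch of that plain-Flink version with keyed state; the 
String state label stands in for the real event type, and all names are 
illustrative:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Emits (previousState, newState) once per discrete change, including
// the initial null -> b transition; repeated b events emit nothing.
public class StateChangeDetector
        extends RichFlatMapFunction<String, Tuple2<String, String>> {

    private transient ValueState<String> lastSeen;

    @Override
    public void open(Configuration parameters) {
        lastSeen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("last-seen", String.class));
    }

    @Override
    public void flatMap(String state, Collector<Tuple2<String, String>> out)
            throws Exception {
        String previous = lastSeen.value();   // null before the first event
        if (!state.equals(previous)) {
            out.collect(Tuple2.of(previous, state));
            lastSeen.update(state);
        }
    }
}

// usage (keyed state requires a keyed stream):
// stream.keyBy(...).flatMap(new StateChangeDetector())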

Thank you for your support, any help is appreciated.  

Best  
Philip  


Re: Watermark on connected stream

2017-10-19 Thread Tzu-Li (Gordon) Tai
Ah, sorry for the duplicate answer, I didn’t see Piotr’s reply. Slight delay on 
the mail client.


On 19 October 2017 at 11:05:01 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) 
wrote:

Hi Kien,

The watermark of an operator with multiple inputs will be determined by the 
current minimum watermark across all inputs.

Cheers,
Gordon


On 19 October 2017 at 8:06:11 PM, Kien Truong (duckientru...@gmail.com) wrote:

Hi, 

If I connect two streams with different watermarks, how is the watermark 
of the resulting stream determined ? 


Best regards, 

Kien 


Re: Accumulator with Elasticsearch Sink

2017-10-19 Thread Tzu-Li (Gordon) Tai
Hi Sendoh,

That sounds like a reasonable metric to add directly to the Elasticsearch 
connector.
Could you perhaps write a comment on that in 
https://issues.apache.org/jira/browse/FLINK-7697?

Cheers,
Gordon

On 19 October 2017 at 8:57:23 PM, Sendoh (unicorn.bana...@gmail.com) wrote:

Hi Flink users,  

Has someone used an accumulator with the Elasticsearch sink? That way we could
compare the last timestamp seen in the sink with the last timestamp in
Elasticsearch, in order to see how long it takes to get from the Elasticsearch
sink into Elasticsearch.

Best,  

Sendoh  





Re: Watermark on connected stream

2017-10-19 Thread Tzu-Li (Gordon) Tai
Hi Kien,

The watermark of an operator with multiple inputs will be determined by the 
current minimum watermark across all inputs.

Cheers,
Gordon


On 19 October 2017 at 8:06:11 PM, Kien Truong (duckientru...@gmail.com) wrote:

Hi, 

If I connect two streams with different watermarks, how is the watermark 
of the resulting stream determined ? 


Best regards, 

Kien 


Re: Watermark on connected stream

2017-10-19 Thread Piotr Nowojski
Hi,

As you can see in 
org.apache.flink.streaming.api.operators.AbstractStreamOperator#processWatermark1,
it takes the minimum of both inputs.

Piotrek
 
> On 19 Oct 2017, at 14:06, Kien Truong  wrote:
> 
> Hi,
> 
> If I connect two streams with different watermarks, how is the watermark of 
> the resulting stream determined ?
> 
> 
> Best regards,
> 
> Kien
> 



Re: java.lang.NoSuchMethodError and dependencies problem

2017-10-19 Thread Piotr Nowojski
Hi,

What is the full stack trace of the error?
Are you sure that there is no commons-compress somewhere in the classpath 
(like in the lib directory)? How are you running your Flink cluster?

Piotrek

> On 19 Oct 2017, at 13:34, r. r.  wrote:
> 
> Hello
> I have a job that runs an Apache Tika pipeline and it fails with "Caused by: 
> java.lang.NoSuchMethodError: 
> org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;"
> 
> Flink includes commons-compress 1.4.1, while Tika needs 1.14. 
> I also have Apache Avro in the project with commons-compress at 1.8.1, so I 
> force 1.14 with 
> <dependencyManagement>
>     <dependencies>
>         <dependency>
>             <groupId>org.apache.commons</groupId>
>             <artifactId>commons-compress</artifactId>
>             <version>1.14</version>
>         </dependency>
>     </dependencies>
> </dependencyManagement>
> this seems to work as mvn dependency:tree -Ddetail=true only shows 1.14 and 
> after purge, the local maven repo also only contains 1.14
> 
> yet, after I deploy the job and it reads an Avro package from Kafka and 
> passes it to Tika, it fails with the error above, which leads me to think it 
> somehow uses commons-compress at a version prior to 1.14, because the method 
> 'detect' is not present in older versions
> 
> I excluded/included org.apache.commons:commons-compress from the fat-jar; 
> still the same problem
> 
> thanks for any hints!
> 
> 



Execution Failed (cluster setup Flink+Hadoop), Task Manager was lost/killed

2017-10-19 Thread Oleksandra Levchenko
Hi, 

I am running a Flink batch job on a standalone cluster (16 nodes), on top of 
Hadoop. 
The chain looks like:

DataSet1 = env.readTextFile (csv on hdfs)
.map
.flatMap
.groupBy
.reduce
.map
.writeAsCsv (DataSet 1)

DataSet2 = env.readTextFile 
.map
.flatMap

env.readCsvFile (DataSet1)
DataSet1.flatJoin(DataSet2)
.groupBy
.reduce
.filter
.count

The job finishes successfully with a DataSource of ~ 24.3 G. 
As I scaled to larger data (244.3 G), it fails with: 

java.lang.Exception: TaskManager was lost/killed: 
3c0d5310e30f7c52eae95ff97bee85e2 @ grisou-46.nancy.grid5000.fr (dataPort=51359)
at 
org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:217)
at 
org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:533)
at 
org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:192)
at 
org.apache.flink.runtime.instance.Instance.markDead(Instance.java:167)
at 
org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:212)
at 
org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(JobManager.scala:1228)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:1131)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at 
org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
at 
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at 
org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at 
org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:125)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at 
akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:44)
at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
at akka.actor.ActorCell.invoke(ActorCell.scala:486)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


The next job submission after a cluster restart produces this ERROR, which I 
guess is caused by the lost TaskManager:

java.lang.IllegalStateException: Update task on TaskManager 
12c673eb2681eb4f1d1cb000e561f1c5 @ grisou-48.nancy.grid5000.fr (dataPort=42326) 
failed due to:
at 
org.apache.flink.runtime.executiongraph.Execution$8.apply(Execution.java:1076)
at 
org.apache.flink.runtime.executiongraph.Execution$8.apply(Execution.java:1073)
at 
org.apache.flink.runtime.concurrent.impl.FlinkFuture$3.recover(FlinkFuture.java:201)
at akka.dispatch.Recover.internal(Future.scala:268)
at akka.dispatch.japi$RecoverBridge.apply(Future.scala:184)
at akka.dispatch.japi$RecoverBridge.apply(Future.scala:182)
at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:216)
at scala.util.Try$.apply(Try.scala:192)
at scala.util.Failure.recover(Try.scala:216)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:324)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:324)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on 

Accumulator with Elasticsearch Sink

2017-10-19 Thread Sendoh
Hi Flink users,

Has someone used an accumulator with the Elasticsearch sink? That way we could
compare the last timestamp seen in the sink with the last timestamp in
Elasticsearch, in order to see how long it takes to get from the Elasticsearch
sink into Elasticsearch.

Best,

Sendoh
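Until such a metric exists in the connector itself, one workaround is a rich 
function placed directly before the sink that tracks the newest timestamp in an 
accumulator. A sketch; the Tuple2 input, with the event timestamp in f1, is 
illustrative:

import org.apache.flink.api.common.accumulators.LongMaximum;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

// Records the largest event timestamp that has reached the sink stage.
public class LastSinkTimestamp
        extends RichMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private final LongMaximum lastTimestamp = new LongMaximum();

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext().addAccumulator("last-sink-timestamp", lastTimestamp);
    }

    @Override
    public Tuple2<String, Long> map(Tuple2<String, Long> value) {
        lastTimestamp.add(value.f1);
        return value;
    }
}

// usage: stream.map(new LastSinkTimestamp()).addSink(elasticsearchSink);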





Watermark on connected stream

2017-10-19 Thread Kien Truong

Hi,

If I connect two streams with different watermarks, how is the watermark 
of the resulting stream determined ?



Best regards,

Kien



java.lang.NoSuchMethodError and dependencies problem

2017-10-19 Thread r. r.
Hello
I have a job that runs an Apache Tika pipeline and it fails with "Caused by: 
java.lang.NoSuchMethodError: 
org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;"

Flink includes commons-compress 1.4.1, while Tika needs 1.14. 
I also have Apache Avro in the project with commons-compress at 1.8.1, so I 
force 1.14 with 

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-compress</artifactId>
            <version>1.14</version>
        </dependency>
    </dependencies>
</dependencyManagement>

this seems to work as mvn dependency:tree -Ddetail=true only shows 1.14 and 
after purge, the local maven repo also only contains 1.14

yet, after I deploy the job and it reads an Avro package from Kafka and passes 
it to Tika, it fails with the error above, which leads me to think it somehow 
uses commons-compress at a version prior to 1.14, because the method 'detect' is 
not present in older versions

I excluded/included org.apache.commons:commons-compress from the fat-jar; 
still the same problem

thanks for any hints!




Re: FlinkCEP: pattern application on a KeyedStream

2017-10-19 Thread Kostas Kloudas
Hi Federico,

If I understand your question correctly, then yes, the application of a Pattern 
on a keyed stream 
is similar to the application of a map function.

It will search for the pattern on each per-key stream of data.
So there will be state (buffer with partial matches, queued elements, etc) for 
every active key.

Cheers,
Kostas

> On Oct 19, 2017, at 11:55 AM, Federico D'Ambrosio 
>  wrote:
> 
> Hi all,
> 
> I was wondering if it is correct to assume that the application of a pattern on a 
> KeyedStream is similar to the application, e.g., of a MapFunction, when it comes 
> to state.
> 
> For example, the following
> 
> val pattern = ...
> val keyedStream = stream.keyBy("id")
> 
> val patternKeyedStream = CEP.pattern(pattern, keyedStream)
> 
> val anotherKeyedStream = patternKeyedStream.select(...)
> 
> should only check the pattern on each single partition value.
> 
> Am I correct in assuming this, or I have misunderstood CEP functioning?
> 
> -- 
> Federico D'Ambrosio



FlinkCEP: pattern application on a KeyedStream

2017-10-19 Thread Federico D'Ambrosio
Hi all,

I was wondering if it is correct to assume that the application of a pattern on
a KeyedStream is similar to the application, e.g., of a MapFunction, when it
comes to state.

For example, the following

val pattern = ...
val keyedStream = stream.keyBy("id")

val patternKeyedStream = CEP.pattern(pattern, keyedStream)

val anotherKeyedStream = patternKeyedStream.select(...)

should only check the pattern on each single partition value.

Am I correct in assuming this, or I have misunderstood CEP functioning?

-- 
Federico D'Ambrosio


Re: Set heap size

2017-10-19 Thread Piotr Nowojski
Hi,

Just log into the machine and check its memory consumption using htop or a 
similar tool under load. Remember to subtract Flink's memory usage and the 
file system cache.

Piotrek

> On 19 Oct 2017, at 10:15, AndreaKinn  wrote:
> 
> About task manager heap size Flink doc says:
> 
> ... If the cluster is exclusively running Flink, the total amount of
> available memory per machine minus some memory for the operating system
> (maybe 1-2 GB) is a good value
> 
> But my nodes have 2GB of RAM each. Isn't there an empirical rule to set the
> heap size, or a way to estimate the RAM used by the OS?
> 
> 
> 



Re: SLF4j logging system gets clobbered?

2017-10-19 Thread Piotr Nowojski
Hi,

What versions of Flink/logback are you using?

Have you read this: 
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/best_practices.html#use-logback-when-running-flink-out-of-the-ide--from-a-java-application ?
Maybe this is an issue of having multiple logging tools and their 
configurations on the class path?

Piotrek

> On 18 Oct 2017, at 20:32, Jared Stehler  
> wrote:
> 
> I’m having an issue where I’ve got logging setup and functioning for my 
> flink-mesos deployment, and works fine up to a point (the same point every 
> time) where it seems to fall back to “defaults” and loses all of my 
> configured filtering.
> 
> 2017-10-11 21:37:17.454 [flink-akka.actor.default-dispatcher-17] INFO  
> o.a.f.m.runtime.clusterframework.MesosFlinkResourceManager  - TaskManager 
> taskmanager-8 has started.
> 2017-10-11 21:37:17.454 [flink-akka.actor.default-dispatcher-16] INFO  
> org.apache.flink.runtime.instance.InstanceManager  - Registered TaskManager 
> at ip-10-80-54-201 
> (akka.tcp://fl...@ip-10-80-54-201.us-west-2.compute.internal:31014/user/taskmanager)
> as 697add78bd00fe7dc6a7aa60bc8d75fb. Current number of registered hosts is 
> 39. Current number of alive task slots is 39.
> 2017-10-11 21:37:18.820 [flink-akka.actor.default-dispatcher-17] INFO  
> org.apache.flink.runtime.instance.InstanceManager  - Registered TaskManager 
> at ip-10-80-54-201 
> (akka.tcp://fl...@ip-10-80-54-201.us-west-2.compute.internal:31018/user/taskmanager)
> as a6cff0f18d71aabfb3b112f5e2c36c2b. Current number of registered hosts is 
> 40. Current number of alive task slots is 40.
> 2017-10-11 21:37:18.821 [flink-akka.actor.default-dispatcher-17] INFO  
> o.a.f.m.runtime.clusterframework.MesosFlinkResourceManager  - TaskManager 
> taskmanager-00010 has started.
> 2017-10-11 
> 21:39:04,371:6171(0x7f67fe9cd700):ZOO_WARN@zookeeper_interest@1570: Exceeded 
> deadline by 13ms
> 
> — here is where it turns over into default pattern layout ---
> 21:39:05.616 [nioEventLoopGroup-5-6] INFO  o.a.flink.runtime.blob.BlobClient 
> - Blob client connecting to akka://flink/user/jobmanager 
> 
> 
> 21:39:09.322 [nioEventLoopGroup-5-6] INFO  o.a.flink.runtime.client.JobClient 
> - Checking and uploading JAR files
> 21:39:09.322 [nioEventLoopGroup-5-6] INFO  o.a.flink.runtime.blob.BlobClient 
> - Blob client connecting to akka://flink/user/jobmanager 
> 
> 21:39:09.788 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.m.r.c.MesosJobManager - Submitting job 005b570ff2866023aa905f2bc850f7a3 
> (Sa-As-2b-Submission-Join-V3 := demos-demo500--data-canvas-2-sa-qs-as-v3).
> 21:39:09.789 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.m.r.c.MesosJobManager - Using restart strategy 
> FailureRateRestartStrategy(failuresInterval=12 ms, delayInterval=1000 ms, 
> maxFailuresPerInterval=3) for 005b570ff2866023aa905f2bc850f7a3.
> 21:39:09.789 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.r.e.ExecutionGraph - Job recovers via failover strategy: full graph 
> restart
> 21:39:09.790 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.m.r.c.MesosJobManager - Running initialization on master for job 
> Sa-As-2b-Submission-Join-V3 := demos-demo500--data-canvas-2-sa-qs-as-v3 
> (005b570ff2866023aa905f2bc850f7a3).
> 21:39:09.790 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.m.r.c.MesosJobManager - Successfully ran initialization on master in 0 
> ms.
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] WARN  
> o.a.f.configuration.Configuration - Config uses deprecated configuration key 
> 'high-availability.zookeeper.storageDir' instead of proper key 
> 'high-availability.storageDir'
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.c.GlobalConfiguration - Loading configuration property: 
> mesos.failover-timeout, 60
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.c.GlobalConfiguration - Loading configuration property: 
> mesos.initial-tasks, 1
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.c.GlobalConfiguration - Loading configuration property: 
> mesos.maximum-failed-tasks, -1
> 21:39:09.791 [flink-akka.actor.default-dispatcher-4] INFO  
> o.a.f.c.GlobalConfiguration - Loading configuration property: 
> mesos.resourcemanager.framework.role, '*'
> 
> The reason this is a vexing issue is that the app master then proceeds to 
> dump megabytes of " o.a.f.c.GlobalConfiguration - Loading configuration 
> property:” messages into the log, and I’m unable to filter them out.
> 
> My logback config is:
> 
> <configuration>
>     <appender name="..." class="...">
>         <encoder>
>             <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{60} %X{sourceThread} - %msg%n</pattern>
>         </encoder>
>         <filter class="...">
>             <level>ERROR</level>
>         </filter>
>     </appender>
> 
>     <logger name="..." level="OFF" />
> </configuration>
> 

Set heap size

2017-10-19 Thread AndreaKinn
About task manager heap size Flink doc says:

... If the cluster is exclusively running Flink, the total amount of
available memory per machine minus some memory for the operating system
(maybe 1-2 GB) is a good value

But my nodes have 2GB of RAM each. Isn't there an empirical rule to set the
heap size, or a way to estimate the RAM used by the OS?





Re: Flink REST API async?

2017-10-19 Thread Francisco Gonzalez Barea
Hello,

Going back to this thread, a quick question: will this be supported in the next 
Flink version? If not, when is it expected to be included?

Regards


On 8 Aug 2017, at 15:46, Aljoscha Krettek  wrote:

I quickly talked to Till about this. The new JobManager, once FLIP-6 is 
implemented, will have a new REST endpoint that allows submitting a JobGraph 
directly. With this, we no longer have to execute the user main() method in the 
WebRuntimeMonitor (which is a component that the current JobManager process 
loads to serve the web frontend and the REST interface).

This should solve the problem, but unfortunately it doesn't solve your current 
problem.

Best,
Aljoscha
On 8. Aug 2017, at 10:26, Francisco Gonzalez Barea 
> wrote:

Aha ok… Thanks for your answer Eron.

Regards


On 7 Aug 2017, at 19:04, Eron Wright  wrote:

When you submit a program via the REST API, the main method executes inside the 
JobManager process. Unfortunately, a static variable is used to establish the 
execution environment that the program obtains from 
`ExecutionEnvironment.getExecutionEnvironment()`.  From the stack trace it 
appears that two main methods are executing simultaneously and one is 
corrupting the other.

On Mon, Aug 7, 2017 at 8:21 AM, Francisco Gonzalez Barea  wrote:
Hi there!

We are doing some POCs submitting jobs remotely to Flink. We tried with Flink 
CLI and now we´re testing the Rest API.

So the point is that when we try to execute a set of requests in an async way 
(using CompletableFutures) only a couple of them run successfully. For the rest 
we get the exception copied at the end of the email.

Do you know the reason for this?

Thanks in advance!!
Regards,

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error.
  at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
  at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
  at 
org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:80)
 at 
org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:318)
 at 
org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:72)
 at 
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleJsonRequest(JarRunHandler.java:61)
 at 
org.apache.flink.runtime.webmonitor.handlers.AbstractJsonRequestHandler.handleRequest(AbstractJsonRequestHandler.java:41)
 at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:109)
 at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:97)
 at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
 at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
 at 
io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
 at 
io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
 at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at 
org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:159)
 at 
org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
 at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
 at