Hi Thomas,

It appears Flink couldn't pick up the Hadoop configuration. Did you set the environment variable HADOOP_CONF_DIR or HADOOP_HOME?
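For example, in the environment of the user that starts the Flink cluster (the paths below are placeholders for wherever your Hadoop installation and its configuration actually live):

```shell
# Point Flink at the directory that contains core-site.xml / hdfs-site.xml.
# /etc/hadoop/conf is a placeholder; use your cluster's actual config directory.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Alternatively, set HADOOP_HOME; Flink then looks for the configuration
# under $HADOOP_HOME/conf and $HADOOP_HOME/etc/hadoop.
export HADOOP_HOME=/usr/local/hadoop
```

These have to be set before running bin/start-cluster.sh so that the JobManager and TaskManagers inherit them.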
Best,
Max

On Sun, Nov 8, 2015 at 7:52 PM, Thomas Götzinger <m...@simplydevelop.de> wrote:
> Sorry for the confusion,
>
> the Flink cluster throws the following stack trace:
>
> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Failed to submit job 29a2a25d49aa0706588ccdb8b7e81c6c (Flink Java Job at Sun Nov 08 18:50:52 UTC 2015)
>     at org.apache.flink.client.program.Client.run(Client.java:413)
>     at org.apache.flink.client.program.Client.run(Client.java:356)
>     at org.apache.flink.client.program.Client.run(Client.java:349)
>     at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63)
>     at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:789)
>     at de.fraunhofer.iese.proopt.Template.run(Template.java:112)
>     at de.fraunhofer.iese.proopt.Main.main(Main.java:8)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
>     at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
>     at org.apache.flink.client.program.Client.run(Client.java:315)
>     at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:582)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:288)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:878)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:920)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Failed to submit job 29a2a25d49aa0706588ccdb8b7e81c6c (Flink Java Job at Sun Nov 08 18:50:52 UTC 2015)
>     at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:594)
>     at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:190)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>     at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>     at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36)
>     at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29)
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>     at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>     at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: No file system found with scheme s3n, referenced in file URI 's3n://big-data-benchmark/pavlo/text/tiny/rankings'.
>     at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:162)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraph.attachJobGraph(ExecutionGraph.java:469)
>     at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$submitJob(JobManager.scala:534)
>     ... 19 more
> Caused by: java.io.IOException: No file system found with scheme s3n, referenced in file URI 's3n://big-data-benchmark/pavlo/text/tiny/rankings'.
>     at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:247)
>     at org.apache.flink.core.fs.Path.getFileSystem(Path.java:309)
>     at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:447)
>     at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:57)
>     at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:146)
>     ... 21 more
>
> --
> Best regards,
>
> Thomas Götzinger
> Freelance software developer (Freiberuflicher Informatiker)
>
> Glockenstraße 2a
> D-66882 Hütschenhausen OT Spesbach
> Mobile: +49 (0)176 82180714
> Private: +49 (0)6371 954050
> mailto:m...@simplydevelop.de
> epost: thomas.goetzin...@epost.de
>
> On 08.11.2015, at 19:06, Thomas Götzinger <m...@simplydevelop.de> wrote:
>
> Hi Fabian,
>
> thanks for the reply. I use a Karamel recipe to install Flink on EC2. Currently I am using flink-0.9.1-bin-hadoop24.tgz, in which NativeS3FileSystem is included. First I tried the standard Karamel recipe on GitHub, hopshadoop/flink-chef, but it is on version 0.9.0 and S3NFileSystem is not included, so I forked the GitHub project as goetzingert/flink-chef. Although the class file is included, the application throws a ClassNotFoundException for the class above.
> In my project I added the following to conf/core-site.xml:
>
> <property>
>   <name>fs.s3n.impl</name>
>   <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
> </property>
> <property>
>   <name>fs.s3n.awsAccessKeyId</name>
>   <value>….</value>
> </property>
> <property>
>   <name>fs.s3n.awsSecretAccessKey</name>
>   <value>...</value>
> </property>
>
> I also tried the programmatic configuration:
>
> XMLConfiguration config = new XMLConfiguration(configPath);
>
> env = ExecutionEnvironment.getExecutionEnvironment();
> Configuration configuration = GlobalConfiguration.getConfiguration();
> configuration.setString("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
> configuration.setString("fs.s3n.awsAccessKeyId", "..");
> configuration.setString("fs.s3n.awsSecretAccessKey", "../");
> configuration.setString("fs.hdfs.hdfssite", Template.class.getResource("/conf/core-site.xml").toString());
> GlobalConfiguration.includeConfiguration(configuration);
>
> Any idea why the class is not included in the classpath? Is there another script to set up Flink on an EC2 cluster?
>
> When will Flink 0.10 be released?
>
> Regards,
>
> Thomas Götzinger
>
> On 29.10.2015, at 09:47, Fabian Hueske <fhue...@gmail.com> wrote:
>
> Hi Thomas,
>
> until recently, Flink shipped its own S3FileSystem implementation, which wasn't fully tested and was buggy. We removed that implementation and now use Hadoop's S3 implementation by default (in 0.10-SNAPSHOT).
>
> If you want to continue using 0.9.1, you can configure Flink to use Hadoop's implementation. See this answer on StackOverflow and the linked email thread [1].
> If you switch to the 0.10-SNAPSHOT version (which will be released in a few days as 0.10.0), things become a bit easier and Hadoop's implementation is used by default. The documentation shows how to configure your access keys [2].
>
> Please don't hesitate to ask if something is unclear or not working.
>
> Best, Fabian
>
> [1] http://stackoverflow.com/questions/32959790/run-apache-flink-with-amazon-s3
> [2] https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html
>
> 2015-10-29 9:35 GMT+01:00 Thomas Götzinger <m...@simplydevelop.de>:
>>
>> Hello Flink team,
>>
>> we at Fraunhofer IESE are evaluating Flink for a project, and I'm a bit frustrated at the moment.
>>
>> I've written a few test cases with the Flink API and want to deploy them to a Flink EC2 cluster. I set up the cluster using the Karamel recipe that is addressed in the following video:
>>
>> https://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=video&cd=1&cad=rja&uact=8&ved=0CDIQtwIwAGoVChMIy86Tq6rQyAIVR70UCh0IRwuJ&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Dm_SkhyMV0to&usg=AFQjCNGKUzFv521yg-OTy-1XqS2-rbZKug&bvm=bv.105454873,d.bGg
>>
>> The setup works fine and the hello-flink app runs. But afterwards I wanted to copy some data from an S3 bucket to the local EC2 HDFS cluster.
>>
>> hadoop fs -ls s3n.... works, as do cat and the other commands. But if I try to copy the data with distcp, the command freezes and does not respond until a timeout.
>>
>> After trying a few things I gave up and started on another solution: accessing the S3 bucket directly with Flink and importing the data with a small Flink program that just reads from S3 and writes to the local Hadoop. This works fine locally, but on the cluster the S3NFileSystem class is missing (ClassNotFoundException), although it is included in the jar file of the installation.
>>
>> I forked the Chef recipe and updated it to Flink 0.9.1, but hit the same issue.
>>
>> Is there another simple script to install Flink with Hadoop on an EC2 cluster with a working s3n file system?
>>
>> Freelancer, on behalf of Fraunhofer IESE Kaiserslautern
>>
>> --
>> Best regards,
>>
>> Thomas Götzinger
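Editor's note for readers hitting the same "No file system found with scheme s3n" error on Flink 0.9.1: the StackOverflow answer referenced as [1] above amounts to pointing Flink at a Hadoop configuration directory whose core-site.xml registers the s3n file system. A minimal sketch of the Flink side, assuming the Hadoop config directory is /etc/hadoop/conf (a placeholder path):

```
# flink-conf.yaml (sketch): tell Flink where the Hadoop configuration lives.
# Flink then loads core-site.xml from this directory, picking up the
# fs.s3n.impl and fs.s3n.aws* entries shown earlier in the thread.
fs.hdfs.hadoopconf: /etc/hadoop/conf
```

With that in place, paths like s3n://big-data-benchmark/pavlo/text/tiny/rankings should resolve through Hadoop's NativeS3FileSystem rather than failing at input-split creation.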