Hi Fabian,

thanks for the reply. I use a Karamel recipe to install Flink on EC2. Currently I 
am using flink-0.9.1-bin-hadoop24.tgz 
<http://apache.mirrors.spacedump.net/flink/flink-0.9.1/flink-0.9.1-bin-hadoop24.tgz>.

In that file the NativeS3FileSystem is included. First I tried the standard 
Karamel recipe on GitHub, hopshadoop/flink-chef 
<https://github.com/hopshadoop/flink-chef>, but it is on version 0.9.0 and the 
NativeS3FileSystem is not included.
So I forked the GitHub project to goetzingert/flink-chef.
Although the class file is included, the application throws a 
ClassNotFoundException for the class above.
In my project I added the following to conf/core-site.xml:

  <property>
    <name>fs.s3n.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>….</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>...</value>
  </property>
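The XML above only takes effect if Flink can actually find it. On 0.9.x, Flink locates Hadoop configuration files via the fs.hdfs.hadoopconf key in flink-conf.yaml, so it may be worth double-checking that the Chef recipe sets it on every node. A minimal sketch (the path is a placeholder for whatever directory holds your core-site.xml):

```yaml
# flink-conf.yaml -- point Flink at the directory containing core-site.xml
# (placeholder path; adjust to your installation layout)
fs.hdfs.hadoopconf: /usr/local/flink/conf
```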

— 
I also tried the programmatic configuration:

XMLConfiguration config = new XMLConfiguration(configPath);

env = ExecutionEnvironment.getExecutionEnvironment();
Configuration configuration = GlobalConfiguration.getConfiguration();
configuration.setString("fs.s3n.impl",
        "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
configuration.setString("fs.s3n.awsAccessKeyId", "..");
configuration.setString("fs.s3n.awsSecretAccessKey", "../");
configuration.setString("fs.hdfs.hdfssite",
        Template.class.getResource("/conf/core-site.xml").toString());
GlobalConfiguration.includeConfiguration(configuration);


Any idea why the class is not on the classpath? Is there another script to 
set up Flink on an EC2 cluster?
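One way to narrow this down is to list the contents of the deployed fat jar on a TaskManager node and grep for the class; on a real node that would be something like `jar tf /usr/local/flink/lib/flink-dist-0.9.1.jar | grep NativeS3FileSystem` (the jar path is an assumption and differs per install). If the entry is missing there, the ClassNotFoundException comes from the cluster jar rather than from the application jar. Since a jar is just a zip, the self-contained sketch below demonstrates the same check against a throwaway archive:

```shell
# Build a throwaway "jar" (a plain zip) containing the class entry in question,
# then list it and grep -- the same check you would run against flink-dist
# on a cluster node.
mkdir -p demo/org/apache/hadoop/fs/s3native
touch demo/org/apache/hadoop/fs/s3native/NativeS3FileSystem.class
python3 -m zipfile -c flink-demo.jar demo/
python3 -m zipfile -l flink-demo.jar | grep NativeS3FileSystem
```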

When will flink 0.10 be released? 



Regards

 
Thomas Götzinger

Freiberuflicher Informatiker

 
Glockenstraße 2a

D-66882 Hütschenhausen OT Spesbach

Mobil: +49 (0)176 82180714

Privat: +49 (0) 6371 954050

mailto:m...@simplydevelop.de <mailto:thomas.goetzin...@kajukin.de>
epost: thomas.goetzin...@epost.de <mailto:thomas.goetzin...@epost.de>




> On 29.10.2015, at 09:47, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Hi Thomas,
> 
> until recently, Flink provided its own implementation of an S3FileSystem, which 
> wasn't fully tested and was buggy.
> We removed that implementation and are now (in 0.10-SNAPSHOT) using Hadoop's 
> S3 implementation by default.
> 
> If you want to continue using 0.9.1, you can configure Flink to use Hadoop's 
> implementation. See this answer on StackOverflow and the linked email thread 
> [1].
> If you switch to the 0.10-SNAPSHOT version (which will be released in a few 
> days as 0.10.0), things become a bit easier, since Hadoop's implementation is 
> used by default. The documentation shows how to configure your access keys [2].
> 
> Please don't hesitate to ask if something is unclear or not working.
> 
> Best, Fabian
> 
> [1] 
> http://stackoverflow.com/questions/32959790/run-apache-flink-with-amazon-s3
> [2] 
> https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html
> 
> 2015-10-29 9:35 GMT+01:00 Thomas Götzinger <m...@simplydevelop.de 
> <mailto:m...@simplydevelop.de>>:
> Hello Flink Team,
> 
> We at Fraunhofer IESE are evaluating Flink for a project and I'm a bit 
> frustrated at the moment. 
> 
> I wrote a few test cases with the Flink API and want to deploy them to a 
> Flink EC2 cluster. I set up the cluster using the Karamel recipe shown in the 
> following video:
> 
> https://www.youtube.com/watch?v=m_SkhyMV0to
> 
> The setup works fine and the hello-flink app runs. But afterwards I want to 
> copy some data from an S3 bucket to the local EC2 HDFS cluster. 
> 
> The hadoop fs -ls s3n.... command works, as does cat, ...
> But if I want to copy the data with distcp, the command freezes and does not 
> respond until it times out.
> 
> After trying a few things I gave up and started on another solution: access 
> the S3 bucket directly with Flink and import the data using a small Flink 
> program which just reads from S3 and writes to local Hadoop. This works fine 
> locally, but on the cluster the NativeS3FileSystem class is missing 
> (ClassNotFoundException), although it is included in the jar file of the 
> installation. 
> 
> 
> I forked the Chef recipe and updated it to Flink 0.9.1, but hit the same issue.
> 
> Is there another simple script to install Flink with Hadoop on an EC2 cluster 
> with a working s3n filesystem?
> 
> 
> 
> 
> Freelancer 
> 
> on behalf of Fraunhofer IESE Kaiserslautern
> 
> 
> -- 
> Viele Grüße
> 
>  
> Thomas Götzinger
> 
> Freiberuflicher Informatiker
> 
>  
> Glockenstraße 2a
> 
> D-66882 Hütschenhausen OT Spesbach
> 
> Mobil: +49 (0)176 82180714
> 
> Homezone: +49 (0) 6371 735083
> 
> Privat: +49 (0) 6371 954050
> 
> mailto:m...@simplydevelop.de <mailto:thomas.goetzin...@kajukin.de>
> epost: thomas.goetzin...@epost.de <mailto:thomas.goetzin...@epost.de>
