I also asked this question on SO: https://stackoverflow.com/questions/27232966/what-causes-flume-with-gcs-sink-to-throw-a-outofmemoryexepction
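For context, here is roughly what the full sink definition looks like. Only the hdfs.path line quoted below is verbatim from my setup; the other lines are a sketch using standard Flume HDFS-sink properties, with illustrative agent/channel names:

  a1.channels = c1
  a1.sinks = hdfs_sink
  a1.sinks.hdfs_sink.channel = c1
  a1.sinks.hdfs_sink.type = hdfs
  a1.sinks.hdfs_sink.hdfs.path = gs://bucket_name/%{env}/%{tenant}/%{type}/%Y-%m-%d
  # hdfs.fileType defaults to SequenceFile, which matches the
  # HDFSSequenceFile.open frame in the stack trace below
  a1.sinks.hdfs_sink.hdfs.fileType = SequenceFile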
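One lead I am looking at: when nothing else is configured, the stock flume-ng launcher falls back to a very small heap (JAVA_OPTS="-Xmx20m" in the script), while the GCS connector allocates a large in-memory buffer per open output stream, which could explain an OOM right in BufferedOutputStream.<init>. A minimal sketch of how one might raise the heap, assuming the standard conf/flume-env.sh mechanism (the 512m value is a guess, not something I have validated):

  # conf/flume-env.sh
  # Without this, the flume-ng script defaults to JAVA_OPTS="-Xmx20m",
  # which is smaller than the GCS connector's default write buffer.
  export JAVA_OPTS="-Xmx512m"

The other knob that may matter is the connector's write-buffer size (fs.gs.io.buffersize.write in the gcs-connector sources, 64 MB by default if I am reading it right); a heap smaller than that buffer would fail exactly where the trace below fails.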
On 01/12/2014 15:35, Jean-Philippe Caruana wrote:
> Hi,
>
> I managed to write to GCS from Flume [1], but it is not fully working yet:
> - files are created in the expected directories, but they are empty
> - Flume throws a java.lang.OutOfMemoryError: Java heap space:
>
> java.lang.OutOfMemoryError: Java heap space
>     at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
>     at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:79)
>     at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:820)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773)
>     at org.apache.flume.sink.hdfs.HDFSSequenceFile.open(HDFSSequenceFile.java:96)
>
> (complete stack trace here: http://pastebin.com/i5iSgCM3)
>
> Has anyone seen this before?
> Is it a bug in Google's gcs-connector-latest-hadoop2.jar?
> Where should I look to find out what's wrong?
>
> My configuration looks like this:
> a1.sinks.hdfs_sink.hdfs.path = gs://bucket_name/%{env}/%{tenant}/%{type}/%Y-%m-%d
>
> I am running Flume from Docker.
>
> [1] http://stackoverflow.com/questions/27174033/what-is-the-minimal-setup-needed-to-write-to-hdfs-gs-on-google-cloud-storage-wit
>
> Thanks.

--
Jean-Philippe Caruana
http://www.barreverte.fr
