Hi, I'm trying to use the S3 block file system in Spark, i.e. s3:// URLs (*not* s3n://), and I always get the following error:
Py4JJavaError: An error occurred while calling o3188.saveAsParquetFile.
: org.apache.hadoop.fs.s3.S3FileSystemException: Not a Hadoop S3 file.
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.checkMetadata(Jets3tFileSystemStore.java:206)
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:165)
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:221)
	at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
	at com.sun.proxy.$Proxy24.retrieveINode(Unknown Source)
	at org.apache.hadoop.fs.s3.S3FileSystem.mkdir(S3FileSystem.java:158)
	at org.apache.hadoop.fs.s3.S3FileSystem.mkdirs(S3FileSystem.java:151)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
	at org.apache.hadoop.fs.s3.S3FileSystem.create(S3FileSystem.java:234)
	[.. snip ..]

I believe I must somehow initialize the file system (in particular its metadata), but I can't find out how to do it. I'm using Spark 1.2.0rc1 with Hadoop 2.4 and Riak CS (instead of S3), if that matters. The s3n:// protocol works with the same settings.

Thanks.

-- Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
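P.S. For completeness, this is roughly how the credentials are configured in my core-site.xml (a sketch; the key values are placeholders, and I believe the Riak CS endpoint itself is pointed at via JetS3t's own configuration rather than here). The same access/secret values under the fs.s3n.* property names are what makes s3n:// work for me:

```xml
<!-- core-site.xml fragment: credentials for the jets3t-backed S3 file systems.
     ACCESS_KEY / SECRET_KEY are placeholders, not real values. -->
<configuration>
  <!-- Used by the s3:// block file system (the one that fails) -->
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>SECRET_KEY</value>
  </property>
  <!-- Used by the s3n:// native file system (the one that works) -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>SECRET_KEY</value>
  </property>
</configuration>
```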