Hi, I'm trying to use the S3 block file system in Spark, i.e. s3:// URLs (*not* s3n://), and I always get the following error:
Py4JJavaError: An error occurred while calling o3188.saveAsParquetFile.
: org.apache.hadoop.fs.s3.S3FileSystemException: Not a Hadoop S3 file.
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.checkMetadata(Jets3tFileSystemStore.java:206)
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:165)
	at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:221)
	at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
	at com.sun.proxy.$Proxy24.retrieveINode(Unknown Source)
	at org.apache.hadoop.fs.s3.S3FileSystem.mkdir(S3FileSystem.java:158)
	at org.apache.hadoop.fs.s3.S3FileSystem.mkdirs(S3FileSystem.java:151)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
	at org.apache.hadoop.fs.s3.S3FileSystem.create(S3FileSystem.java:234)
	[.. snip ..]

I believe I must somehow initialize the file system (in particular its metadata), but I can't find out how to do it. I'm using Spark 1.2.0rc1 with Hadoop 2.4 and Riak CS (instead of S3), if that matters. The s3n:// protocol works with the same settings.

Thanks.

-- Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
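P.S. For completeness, this is roughly how the credentials are configured in my core-site.xml (a sketch; the key values are placeholders, and I believe the Riak CS endpoint itself is pointed at via JetS3t's own configuration rather than here). The same access/secret values under the fs.s3n.* property names are what makes s3n:// work for me:

```xml
<!-- core-site.xml fragment: credentials for the jets3t-backed S3 file systems.
     ACCESS_KEY / SECRET_KEY are placeholders, not real values. -->
<configuration>
  <!-- Used by the s3:// block file system (the one that fails) -->
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>SECRET_KEY</value>
  </property>
  <!-- Used by the s3n:// native file system (the one that works) -->
  <property>
    <name>fs.s3n.awsAccessKeyId</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3n.awsSecretAccessKey</name>
    <value>SECRET_KEY</value>
  </property>
</configuration>
```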