Another thing to keep in mind is that each rename() on S3 is a copy, and since we tend to move files around, our compaction goes like:
- create the file in .tmp
- copy the file to the region/family dir
- copy the old files to the archive
...and an HFile copy is not cheap.
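The cost of those moves can be sketched with a back-of-the-envelope model (a toy illustration only, not HBase code; the file sizes are invented):

```python
# Toy cost model: bytes physically copied by one compaction's file moves.
# On an HDFS-like filesystem, rename() is a metadata-only operation;
# on S3, rename() degenerates into copy + delete.

def compaction_copy_cost(new_hfile_bytes, old_hfile_bytes, rename_is_copy):
    """Bytes copied by the post-write moves of a compaction:
    1) move the new file from .tmp into the region/family dir,
    2) move the replaced files into the archive."""
    per_rename = (lambda size: size) if rename_is_copy else (lambda size: 0)
    cost = per_rename(new_hfile_bytes)                   # .tmp -> region/family dir
    cost += sum(per_rename(b) for b in old_hfile_bytes)  # old files -> archive
    return cost

# Example: a 1 GB compacted file replacing three 400 MB files.
GB = 1024 ** 3
MB = 1024 ** 2
hdfs_bytes = compaction_copy_cost(1 * GB, [400 * MB] * 3, rename_is_copy=False)
s3_bytes = compaction_copy_cost(1 * GB, [400 * MB] * 3, rename_is_copy=True)
print(hdfs_bytes)  # 0 -- renames are metadata-only
print(s3_bytes)    # 2332033024 (~2.2 GB) copied just to "move" files
```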
Matteo

On Fri, Nov 7, 2014 at 6:16 PM, Andrew Purtell <[email protected]> wrote:

> And note this is any file, potentially: table descriptors, what have you.
>
> S3 isn't a filesystem; we can't pretend it is one.
>
> On Fri, Nov 7, 2014 at 10:13 AM, Andrew Purtell <[email protected]> wrote:
>
>> Admittedly it's been *years* since I experimented with pointing an HBase root at an s3 or s3n filesystem, but my (dated) experience is that it could take some time for newly written objects to show up in a bucket. The write will have completed and the file will be closed, but upon an immediate open attempt the client would get 500s from the S3 service. HBase expects to be able to read back files it has just written during flushing, compaction, and the like. When S3 confounds this expectation, and it did so for me quite often, the RegionServer aborts.
>>
>> On Thu, Nov 6, 2014 at 11:04 PM, Nick Dimiduk <[email protected]> wrote:
>>
>>> I have not created an exhaustive checklist of the DFS feature requirements of HBase. It would be an informative exercise and could be used to drive toward broader support. My understanding is that these are fundamental problems in the semantics of S3. As I mentioned earlier, I believe the Azure folks have been through this list in order to support HBase on their service.
>>>
>>> On Sat, Nov 1, 2014 at 4:29 PM, Khaled Elmeleegy <[email protected]> wrote:
>>>
>>>> True. HDFS has a strong consistency model, whereas S3 has a more relaxed one: eventual consistency. Do we know, though, what it is about eventual consistency that breaks HBase? Is it a fundamental problem that we can't fix, or just an implementation issue that could be fixed? If it's the latter, then I think it's worth weighing the cost of fixing this problem against the benefit of being able to deploy on S3 directly. So any insights there would be highly appreciated.
>>>> Best,
>>>> Khaled
>>>>
>>>>> From: [email protected]
>>>>> Date: Sat, 1 Nov 2014 12:36:40 -0700
>>>>> Subject: Re: s3n with hbase
>>>>> To: [email protected]
>>>>>
>>>>> It's a reliability/stability problem. The S3 implementation of the FS doesn't provide the characteristics we rely on, because S3 doesn't have those characteristics. It may be that there are improvements to be made in the s3 or s3n drivers, but I believe there's a fundamental difference in the semantics of S3 compared to HDFS. Agreed that this would be an incredibly useful deployment model. I've heard you can run HBase directly against the Azure equivalent of S3, if you're willing to try a different cloud. Haven't tried it myself.
>>>>>
>>>>> On Sat, Nov 1, 2014 at 12:21 PM, Khaled Elmeleegy <[email protected]> wrote:
>>>>>
>>>>>> So, is it a performance problem? Or a correctness problem, e.g. reads may return stale values? Or a reliability problem, where a file system that is not strongly consistent can drive HBase unstable, introducing failures? I can live with a performance hit, given the simplicity of running directly on S3. Obviously, I can't take a reliability hit though.
>>>>>>
>>>>>>> From: [email protected]
>>>>>>> Date: Fri, 31 Oct 2014 17:40:16 -0700
>>>>>>> Subject: Re: s3n with hbase
>>>>>>> To: [email protected]
>>>>>>>
>>>>>>> Please don't do this. S3 is not a strongly consistent filesystem, and HBase will not be happy there. Better to run on HDFS and do snapshot/copytable backups, restoring to S3.
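The failure mode described earlier in the thread (a write completes, but an immediate read-back misses the object) can be illustrated with a toy eventually consistent store. This is a simulation under assumed semantics, not the real S3 API; the key names and delay are invented:

```python
# Toy model of an eventually consistent object store: a PUT "succeeds"
# immediately, but the object only becomes visible to reads after a
# propagation delay. A read-back issued too soon sees nothing -- which is
# what HBase experiences when it reopens a file it just flushed or compacted.

class EventuallyConsistentStore:
    def __init__(self, propagation_delay):
        self.delay = propagation_delay
        self.objects = {}  # key -> (value, time_written)

    def put(self, now, key, value):
        self.objects[key] = (value, now)  # the write call returns success

    def get(self, now, key):
        if key in self.objects:
            value, written = self.objects[key]
            if now - written >= self.delay:
                return value
        return None  # not visible yet: looks like a missing file

store = EventuallyConsistentStore(propagation_delay=5)
store.put(now=0, key="region/family/hfile-1", value=b"hfile bytes")

# Read-back right after the write: the file "isn't there" -> abort path.
print(store.get(now=1, key="region/family/hfile-1"))   # None
# The same read later succeeds, once the write has propagated.
print(store.get(now=10, key="region/family/hfile-1"))  # b'hfile bytes'
```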
>>>>>>>
>>>>>>> On Fri, Oct 31, 2014 at 4:53 PM, Khaled Elmeleegy <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to use HBase with S3, using s3n, but I get the below errors when starting the master. I am testing this in pseudo-distributed mode on my laptop. I've set hbase.rootdir to s3n://kdiaa-hbase.s3-us-west-2.amazonaws.com:80/root, where the corresponding bucket and directory are already created on S3. I've also set fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey to the appropriate values in hbase-site.xml.
>>>>>>>>
>>>>>>>> So, I must be missing something. Any advice is appreciated.
>>>>>>>>
>>>>>>>> 2014-10-31 16:47:15,312 WARN [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/root' - Unexpected response code 404, expected 200
>>>>>>>> 2014-10-31 16:47:15,349 WARN [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/root_%24folder%24' - Unexpected response code 404, expected 200
>>>>>>>> 2014-10-31 16:47:15,420 WARN [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/' - Unexpected response code 404, expected 200
>>>>>>>> 2014-10-31 16:47:15,420 WARN [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/' - Received error response with XML message
>>>>>>>> 2014-10-31 16:47:15,601 FATAL [master:172.16.209.239:60000] master.HMaster: Unhandled exception. Starting shutdown.
>>>>>>>> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>kdiaa-hbase.s3-us-west-2.amazonaws.com</BucketName><RequestId>1589CC5DB70ED750</RequestId><HostId>cb2ZGGlNkxtf5fredweXt/wxJlAHLkioUJC86pkh0JxQfBJ1CMYoZuxHU1g+CnTB</HostId></Error>
>>>>>>>>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:245)
>>>>>>>>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:181)
>>>>>>>>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:158)
>>>>>>>>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:151)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>>>>>>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>>>>>>>     at org.apache.hadoop.fs.s3native.$Proxy9.list(Unknown Source)
>>>>>>>>     at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:432)
>>>>>>>>     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
>>>>>>>>     at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:439)
>>>>>>>>     at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:147)
>>>>>>>>     at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:128)
>>>>>>>>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:802)
>>>>>>>>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
>>>>>>>>     at java.lang.Thread.run(Thread.java:695)
>>>>>>>> Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>kdiaa-hbase.s3-us-west-2.amazonaws.com</BucketName><RequestId>1589CC5DB70ED750</RequestId><HostId>cb2ZGGlNkxtf5fredweXt/wxJlAHLkioUJC86pkh0JxQfBJ1CMYoZuxHU1g+CnTB</HostId></Error>
>>>>>>>>     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:424)
>>>>>>>>     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
>>>>>>>>     at org.jets3t.service.impl.rest.httpclient.RestS3Service.listObjectsInternal(RestS3Service.java:1083)
>>>>>>>>     at org.jets3t.service.impl.rest.httpclient.RestS3Service.listObjectsChunkedImpl(RestS3Service.java:1053)
>>>>>>>>     at org.jets3t.service.S3Service.listObjectsChunked(S3Service.java:1333)
>>>>>>>>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:168)
>>>>>>>>     ... 17 more
>>>>>>>> 2014-10-31 16:47:15,604 INFO [master:172.16.209.239:60000] master.HMaster: Aborting
>>>>>>>> 2014-10-31 16:47:15,612 DEBUG [master:172.16.209.239:60000] master.HMaster: Stopping service threads
>>>>>>>> 2014-10-31 16:47:15,612 INFO [master:172.16.209.239:60000] ipc.RpcServer: Stopping server on 60000
>>>>>>>> 2014-10-31 16:47:15,612 INFO [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
>>>>>>>> 2014-10-31 16:47:15,619 INFO [master:172.16.209.239:60000] master.HMaster: Stopping infoServer
>>>>>>>> 2014-10-31 16:47:15,633 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
>>>>>>>> 2014-10-31 16:47:15,633 INFO [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
>>>>>>>> 2014-10-31 16:47:15,660 INFO [master:172.16.209.239:60000] mortbay.log: Stopped [email protected]:60010
>>>>>>>> 2014-10-31 16:47:15,804 INFO [master:172.16.209.239:60000] zookeeper.ZooKeeper: Session: 0x149689a7dd80000 closed
>>>>>>>> 2014-10-31 16:47:15,804 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
>>>>>>>> 2014-10-31 16:47:15,804 INFO [master:172.16.209.239:60000] master.HMaster: HMaster main thread exiting
>>>>>>>> 2014-10-31 16:47:15,804 ERROR [main] master.HMasterCommandLine: Master exiting
>>>>>>>> java.lang.RuntimeException: HMaster Aborted
>>>>>>>>     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
>>>>>>>>     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
>>>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>>>>>>     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
>>>>>>>>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
>>
>> --
>> Best regards,
>>
>>    - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
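Incidentally, the NoSuchBucket failure in the log above looks like a URI problem separate from the consistency discussion: the s3n scheme takes only the bucket name as the URI authority, so the regional endpoint with port (kdiaa-hbase.s3-us-west-2.amazonaws.com:80) was itself treated as a bucket name that does not exist, exactly as the <BucketName> element in the error shows. A plausible hbase-site.xml fragment, assuming the bucket is actually named kdiaa-hbase, would be:

```xml
<!-- Sketch only: s3n:// expects the bare bucket name, not the endpoint host. -->
<property>
  <name>hbase.rootdir</name>
  <value>s3n://kdiaa-hbase/root</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```

This would get past the NoSuchBucket error, though the rest of the thread argues that running HBase on s3n is still inadvisable.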
