Another thing to keep in mind is that each rename() on S3 is a copy,
and since we tend to move files around, our compaction looks like:
 - create the file in .tmp
 - copy the file to the region/family dir
 - copy the old files to the archive
..and an HFile copy is not cheap.
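
The copy cost in the compaction flow above can be sketched with a toy model (purely illustrative, not HBase or S3 code): when rename() is implemented as copy-then-delete, every "move" pays the full cost of rewriting the object's bytes.

```python
class ToyObjectStore:
    """Toy stand-in for an S3-like store where rename() is copy+delete."""
    def __init__(self):
        self.objects = {}      # key -> bytes
        self.bytes_copied = 0  # track how much data rename() rewrites

    def put(self, key, data):
        self.objects[key] = data

    def rename(self, src, dst):
        data = self.objects[src]
        self.bytes_copied += len(data)  # the hidden copy cost
        self.objects[dst] = data        # "copy"
        del self.objects[src]           # then "delete"

store = ToyObjectStore()
hfile = b"x" * 1024  # pretend 1 KB HFile

# The compaction flow from the mail: write to .tmp, move into the
# region/family dir, move the old file to the archive.
store.put(".tmp/hfile1", hfile)
store.put("region/family/old_hfile", hfile)
store.rename(".tmp/hfile1", "region/family/hfile1")
store.rename("region/family/old_hfile", "archive/old_hfile")

print(store.bytes_copied)  # 2048: both "moves" rewrote the full data
```

On a real filesystem both renames would be metadata-only operations; here they rewrote every byte, which is why a multi-GB HFile "move" on S3 is expensive.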

Matteo


On Fri, Nov 7, 2014 at 6:16 PM, Andrew Purtell <[email protected]> wrote:

> And note this is any file, potentially, table descriptors, what have you.
>
> S3 isn't a filesystem, we can't pretend it is one.
>
> On Fri, Nov 7, 2014 at 10:13 AM, Andrew Purtell <[email protected]>
> wrote:
>
> > Admittedly it's been *years* since I experimented with pointing an HBase
> > root at an s3 or s3n filesystem, but my (dated) experience is it could take
> > some time for newly written objects to show up in a bucket. The write will
> > have completed and the file will be closed, but upon an immediate open
> > attempt the client would get 500s from the S3 service. HBase expects to be
> > able to read back files it has just written during flushing, compaction,
> > and the like. When S3 confounds this expectation, and it did so for me
> > quite often, the RegionServer aborts.
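
The read-after-write failure Andrew describes can be mimicked with a toy store whose reads lag writes (a simulation only, not any HBase or S3 API; the lag behavior is an assumption for illustration):

```python
class EventuallyConsistentStore:
    """Toy model: a newly written key only becomes visible after `lag`
    read attempts, mimicking eventual consistency for read-after-write."""
    def __init__(self, lag=3):
        self.lag = lag
        self.pending = {}   # key -> [data, reads_until_visible]
        self.visible = {}

    def put(self, key, data):
        self.pending[key] = [data, self.lag]

    def get(self, key):
        if key in self.visible:
            return self.visible[key]
        if key in self.pending:
            self.pending[key][1] -= 1
            if self.pending[key][1] <= 0:
                self.visible[key] = self.pending[key][0]
                return self.visible[key]
        raise FileNotFoundError(key)  # what HBase sees as a failed open

store = EventuallyConsistentStore(lag=3)
store.put("hfile1", b"data")

# Immediate read-back, as HBase does after a flush/compaction, fails:
try:
    store.get("hfile1")
except FileNotFoundError:
    print("open failed right after write")

# Only with retries does the object eventually appear:
for attempt in range(10):
    try:
        data = store.get("hfile1")
        print("visible after", attempt + 2, "reads")
        break
    except FileNotFoundError:
        pass
```

HBase has no such retry loop around these internal read-backs, so in the real system the first failed open is what triggers the RegionServer abort.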
> >
> >
> > On Thu, Nov 6, 2014 at 11:04 PM, Nick Dimiduk <[email protected]>
> wrote:
> >
> >> I have not created an exhaustive checklist of the DFS feature
> >> requirements of HBase. It would be an informative exercise and could be
> >> used to drive toward broader support. My understanding is that these are
> >> fundamental problems in the semantics of S3. As I mentioned earlier, I
> >> believe the Azure folks have been through this list in order to support
> >> HBase on their service.
> >>
> >> On Sat, Nov 1, 2014 at 4:29 PM, Khaled Elmeleegy <[email protected]>
> >> wrote:
> >>
> >> > True. HDFS has a strong consistency model, whereas S3 has a more
> >> > relaxed consistency model -- eventual consistency. Do we know, though,
> >> > what it is about eventual consistency that breaks HBase? Is it a
> >> > fundamental problem that we can't fix, or just an implementation issue
> >> > that could be fixed? If it's the latter, then I think it's good to
> >> > weigh the cost of fixing this problem against the benefit of being
> >> > able to deploy on S3 directly. So any insights there would be highly
> >> > appreciated.
> >> > Best,
> >> > Khaled
> >> >
> >> > > From: [email protected]
> >> > > Date: Sat, 1 Nov 2014 12:36:40 -0700
> >> > > Subject: Re: s3n with hbase
> >> > > To: [email protected]
> >> > >
> >> > > It's a reliability/stability problem. The S3 implementation of the
> >> > > FS doesn't provide the characteristics we rely on because S3 doesn't
> >> > > have these characteristics. It may be that there are improvements to
> >> > > be made in the s3 or s3n drivers, but I believe there's a fundamental
> >> > > difference in the semantics of S3 compared to HDFS. Agreed that this
> >> > > would be an incredibly useful deployment model. I've heard you can
> >> > > run HBase directly against the Azure equivalent of S3, if you're
> >> > > willing to try a different cloud. Haven't tried it myself.
> >> > >
> >> > > On Sat, Nov 1, 2014 at 12:21 PM, Khaled Elmeleegy <[email protected]> wrote:
> >> > >
> >> > > > So, is it a performance problem? Or a correctness problem, e.g.
> >> > > > reads may have stale values? Or a reliability problem, where a not
> >> > > > strongly consistent file system can make HBase unstable,
> >> > > > introducing failures?
> >> > > > I can live with a performance hit, given the simplicity of running
> >> > > > directly on S3. Obviously, I can't take a reliability hit though.
> >> > > >
> >> > > > > From: [email protected]
> >> > > > > Date: Fri, 31 Oct 2014 17:40:16 -0700
> >> > > > > Subject: Re: s3n with hbase
> >> > > > > To: [email protected]
> >> > > > >
> >> > > > > Please don't do this. S3 is not a strongly consistent
> >> > > > > filesystem. HBase will not be happy there. Better to run on HDFS
> >> > > > > and use snapshots/copytable to back up and restore to S3.
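
The backup flow suggested above can be sketched as follows (command fragment only; the table name, snapshot name, and bucket are hypothetical placeholders, and a running cluster with MapReduce is assumed):

```shell
# Take a snapshot of the table from the HBase shell:
echo "snapshot 'mytable', 'mytable-snap'" | hbase shell

# Copy the snapshot's HFiles out to S3 with ExportSnapshot. This runs
# as a MapReduce job writing the files once, which tolerates S3's
# semantics far better than pointing hbase.rootdir at S3 directly.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot mytable-snap \
    -copy-to s3n://my-backup-bucket/hbase-backup \
    -mappers 4
```

Restore goes the other way: export the snapshot back into the cluster's HDFS root and then clone_snapshot from the HBase shell.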
> >> > > > >
> >> > > > > On Fri, Oct 31, 2014 at 4:53 PM, Khaled Elmeleegy <[email protected]> wrote:
> >> > > > >
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > I am trying to use HBase with S3, using s3n, but I get the
> >> > > > > > below errors when starting the master. I am testing this in
> >> > > > > > pseudo-distributed mode on my laptop.
> >> > > > > > I've also set hbase.rootdir to
> >> > > > > > s3n://kdiaa-hbase.s3-us-west-2.amazonaws.com:80/root, where the
> >> > > > > > corresponding bucket and directory are already created on S3.
> >> > > > > > I've also set fs.s3n.awsAccessKeyId and
> >> > > > > > fs.s3n.awsSecretAccessKey to the appropriate values in
> >> > > > > > hbase-site.xml.
> >> > > > > >
> >> > > > > > So, I must be missing something. Any advice is appreciated.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > 2014-10-31 16:47:15,312 WARN  [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/root' - Unexpected response code 404, expected 200
> >> > > > > > 2014-10-31 16:47:15,349 WARN  [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/root_%24folder%24' - Unexpected response code 404, expected 200
> >> > > > > > 2014-10-31 16:47:15,420 WARN  [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/' - Unexpected response code 404, expected 200
> >> > > > > > 2014-10-31 16:47:15,420 WARN  [master:172.16.209.239:60000] httpclient.RestS3Service: Response '/' - Received error response with XML message
> >> > > > > > 2014-10-31 16:47:15,601 FATAL [master:172.16.209.239:60000] master.HMaster: Unhandled exception. Starting shutdown.
> >> > > > > > org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>kdiaa-hbase.s3-us-west-2.amazonaws.com</BucketName><RequestId>1589CC5DB70ED750</RequestId><HostId>cb2ZGGlNkxtf5fredweXt/wxJlAHLkioUJC86pkh0JxQfBJ1CMYoZuxHU1g+CnTB</HostId></Error>
> >> > > > > >         at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:245)
> >> > > > > >         at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:181)
> >> > > > > >         at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:158)
> >> > > > > >         at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:151)
> >> > > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> > > > > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >> > > > > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> > > > > >         at java.lang.reflect.Method.invoke(Method.java:597)
> >> > > > > >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> >> > > > > >         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >> > > > > >         at org.apache.hadoop.fs.s3native.$Proxy9.list(Unknown Source)
> >> > > > > >         at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:432)
> >> > > > > >         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
> >> > > > > >         at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:439)
> >> > > > > >         at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:147)
> >> > > > > >         at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:128)
> >> > > > > >         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:802)
> >> > > > > >         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
> >> > > > > >         at java.lang.Thread.run(Thread.java:695)
> >> > > > > > Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>kdiaa-hbase.s3-us-west-2.amazonaws.com</BucketName><RequestId>1589CC5DB70ED750</RequestId><HostId>cb2ZGGlNkxtf5fredweXt/wxJlAHLkioUJC86pkh0JxQfBJ1CMYoZuxHU1g+CnTB</HostId></Error>
> >> > > > > >         at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:424)
> >> > > > > >         at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
> >> > > > > >         at org.jets3t.service.impl.rest.httpclient.RestS3Service.listObjectsInternal(RestS3Service.java:1083)
> >> > > > > >         at org.jets3t.service.impl.rest.httpclient.RestS3Service.listObjectsChunkedImpl(RestS3Service.java:1053)
> >> > > > > >         at org.jets3t.service.S3Service.listObjectsChunked(S3Service.java:1333)
> >> > > > > >         at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.list(Jets3tNativeFileSystemStore.java:168)
> >> > > > > >         ... 17 more
> >> > > > > > 2014-10-31 16:47:15,604 INFO  [master:172.16.209.239:60000] master.HMaster: Aborting
> >> > > > > > 2014-10-31 16:47:15,612 DEBUG [master:172.16.209.239:60000] master.HMaster: Stopping service threads
> >> > > > > > 2014-10-31 16:47:15,612 INFO  [master:172.16.209.239:60000] ipc.RpcServer: Stopping server on 60000
> >> > > > > > 2014-10-31 16:47:15,612 INFO  [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
> >> > > > > > 2014-10-31 16:47:15,619 INFO  [master:172.16.209.239:60000] master.HMaster: Stopping infoServer
> >> > > > > > 2014-10-31 16:47:15,633 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
> >> > > > > > 2014-10-31 16:47:15,633 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
> >> > > > > > 2014-10-31 16:47:15,660 INFO  [master:172.16.209.239:60000] mortbay.log: Stopped [email protected]:60010
> >> > > > > > 2014-10-31 16:47:15,804 INFO  [master:172.16.209.239:60000] zookeeper.ZooKeeper: Session: 0x149689a7dd80000 closed
> >> > > > > > 2014-10-31 16:47:15,804 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
> >> > > > > > 2014-10-31 16:47:15,804 INFO  [master:172.16.209.239:60000] master.HMaster: HMaster main thread exiting
> >> > > > > > 2014-10-31 16:47:15,804 ERROR [main] master.HMasterCommandLine: Master exiting
> >> > > > > > java.lang.RuntimeException: HMaster Aborted
> >> > > > > >         at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
> >> > > > > >         at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
> >> > > > > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> >> > > > > >         at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> >> > > > > >         at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
> >> > > > > >
> >> > > >
> >> > > >
> >> >
> >> >
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
