> Improvements in Hadoop's S3 client, and in the implementation of S3 itself
> could help to fix throughput problems and mask transient error problems.

Not always being able to read back an object that has been written is deadly.
Having the S3 client cache written data for a while can help but isn't a
complete solution, because the RS can fail and its regions will be reassigned
to another RS... who then might not be able to read the data. A region might
bounce around the cluster taking exceptions on open for a while. This
availability problem could eventually stall all clients. To address this, you
could implement a distributed write-behind cache for S3, but is it worth the
effort and added complexity?
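As a minimal, hypothetical sketch of the write-behind idea: the wrapper below
keeps recently written objects in a local map so a read that immediately
follows a write never depends on S3 read-after-write consistency. The
WriteBehindS3Cache class and the S3Store interface are invented for
illustration and are not part of Hadoop or HBase, and this only covers the
local half of the problem -- a region reassigned to another RS would not see
this cache, which is why the distributed version is the hard (and perhaps not
worthwhile) part.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative only: a trivial abstraction over raw S3 get/put calls.
    interface S3Store {
      byte[] read(String key);
      void write(String key, byte[] data);
    }

    public class WriteBehindS3Cache {
      // Objects written by this process, retained until S3 is assumed consistent.
      private final Map<String, byte[]> recentWrites =
          new ConcurrentHashMap<String, byte[]>();
      private final S3Store store;

      public WriteBehindS3Cache(S3Store store) {
        this.store = store;
      }

      public void put(String key, byte[] data) {
        recentWrites.put(key, data); // serve local reads from memory for now
        store.write(key, data);      // write through to S3
      }

      public byte[] get(String key) {
        byte[] cached = recentWrites.get(key);
        if (cached != null) {
          return cached;             // read-after-write hit, no S3 round trip
        }
        return store.read(key);      // may still hit a transient 404/500
      }

      // Eviction (time-based, or on confirmed S3 consistency) is omitted here.
    }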
> Andy - are you (or other HBase experts) aware if HBase would have problems
> with a HFile store that exhibits variable latency? Specifically, what about
> scenarios where most HFile reads come back in milliseconds, but suddenly
> there is one that takes a few hundred milliseconds (or more).

If we are talking about S3, then I have observed latencies in the thousands of
milliseconds. The scenario you describe won't cause a service availability
problem on the HBase side -- we can tolerate a wide range of read and write
latencies -- but it would obviously impact performance: whenever a read
affected by a latency spike is for a block not in the block cache, the client
will see it.

Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

----- Original Message -----
> From: Jagane Sundar <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Wednesday, December 28, 2011 12:02 PM
> Subject: RE: Writing the WAL to a different filesystem from the HFiles
>
> Hello Andy,
>
> >>> No, definitely not full object reads, we use HDFS positioned reads,
> >>> which allow us to request, within a gigabyte plus store file, much
> >>> smaller byte ranges (e.g. 64 KB), and receive back only the requested
> >>> data. We can "seek" around the file.
>
> Ahh. This is good to know. HTTP range requests should work for this mode of
> operation. I will take a look at Hadoop's S3 FileSystemStore implementation
> and see if it uses HTTP range requests.
>
> >>> Aside from several IMHO showstopper performance problems, the shortest
> >>> answer is HBase often wants to promptly read back store files it has
> >>> written, and S3 is too eventual often enough (transient 404s or 500s)
> >>> to preclude reliable operation.
>
> Hmm. OK. The potential performance problems are worrisome.
>
> Improvements in Hadoop's S3 client, and in the implementation of S3 itself
> could help to fix throughput problems and mask transient error problems.
> There are rumors of a version of the Hadoop S3 client implementation that
> uses parallel reads to greatly improve throughput.
>
> Andy - are you (or other HBase experts) aware if HBase would have problems
> with a HFile store that exhibits variable latency? Specifically, what about
> scenarios where most HFile reads come back in milliseconds, but suddenly
> there is one that takes a few hundred milliseconds (or more).
>
> Thanks,
> Jagane
>
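For concreteness, below is a short sketch of the positioned-read pattern
referenced above ("we use HDFS positioned reads ... much smaller byte
ranges"), using Hadoop's standard FSDataInputStream API. The file path,
offset, and block size are placeholders rather than values from this thread,
and whether a given S3 FileSystem implementation turns such a read into an
HTTP range request is exactly what Jagane proposes to check.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PositionedReadExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path storeFile = new Path(args[0]);  // e.g. an HFile of a gigabyte or more
        FileSystem fs = storeFile.getFileSystem(conf);

        long blockOffset = 123456789L;       // placeholder offset of one HFile block
        byte[] block = new byte[64 * 1024];  // a 64 KB range, as in the example above

        FSDataInputStream in = fs.open(storeFile);
        try {
          // readFully(position, ...) fetches only the requested byte range and
          // leaves the stream's current position untouched; no full-object read
          // is needed to get at data deep inside the file.
          in.readFully(blockOffset, block, 0, block.length);
        } finally {
          in.close();
        }
        // 'block' now holds the requested 64 KB from the middle of the store file.
      }
    }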
