2016-04-29 11:35 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> bq. AsyncFSOutput will be in HDFS-3.0
>
> Is there an HDFS JIRA for the above? Can you share the number?
>
I have not filed a new one, but there are a bunch of related issues already,
such as this one:

https://issues.apache.org/jira/browse/HDFS-916

> bq. Just wrap FSDataOutputStream to make it act like an asynchronous output
>
> Can you be a bit more specific?
> HBase currently works with WASB and Azure Data Lake. Does the above mean
> their performance would suffer?
>
Yes, the performance will suffer... The fallback implementation does not aim
for good performance, just for compatibility with any FileSystem
implementation.

> On Thu, Apr 28, 2016 at 8:30 PM, 张铎 <palomino...@gmail.com> wrote:
>
> > Inline comments.
> > Thanks,
> >
> > 2016-04-29 10:57 GMT+08:00 Sean Busbey <bus...@cloudera.com>:
> >
> > > I am nervous about having default out-of-the-box new HBase users
> > > reliant on a bespoke HDFS client, especially given Hadoop's
> > > compatibility promises and history. Answers for these questions
> > > would make me more confident:
> > >
> > > 1) Where are we on getting the client-side changes to HDFS pushed back
> > > upstream?
> > >
> > No progress yet... Here I want to tell a good story that HBase already
> > uses it as the default :)
> >
> > > 2) How well do we detect when our FS is not HDFS and what does
> > > fallback look like?
> > >
> > Just wrap FSDataOutputStream to make it act like an asynchronous
> > output (call hflush in a separate thread). The performance is not good, I
> > think.
> >
> > > 3) Will this mean altering the versions of Hadoop we label as
> > > supported for HBase 2.y+?
> > >
> > I have tested with Hadoop versions from 2.4.x to 2.7.x, so I don't think
> > we need to change the supported versions?
> >
> > > 4) How are we going to ensure our client remains compatible with newer
> > > Hadoop releases?
> > >
> > We cannot ensure that; HDFS always breaks HBase at a new release...
> > I need to test AsyncFSWAL on every new 2.x release and make it compatible
> > with that version. And back to #1, I think we should make sure that the
> > AsyncFSOutput will be in HDFS-3.0. And in HBase-3.0, we can introduce a
> > new 'AsyncFSWAL' that uses the AsyncFSOutput in HDFS.
> >
> > > On Thu, Apr 28, 2016 at 9:42 PM, Duo Zhang <zhang...@apache.org> wrote:
> > > > Six months after I filed HBASE-14790...
> > > >
> > > > Now the AsyncFSWAL is ready. The WALPE result shows that it is
> > > > *1.4x~3.7x* faster than FSHLog. The ITBLL result shows that it is
> > > > *no worse* than FSHLog (the master branch is not that stable
> > > > itself...).
> > > >
> > > > More details can be found on HBASE-15536.
> > > >
> > > > So here we propose to change the default WAL from FSHLog to
> > > > AsyncFSWAL. Suggestions are welcome.
> > > >
> > > > Thanks.
> > >
> > > --
> > > busbey
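
For readers following the fallback discussed above (wrapping FSDataOutputStream and calling hflush in a separate thread), here is a minimal sketch of the idea. It is an illustration under stated assumptions, not the actual HBase code: the class name AsyncOutputWrapper is hypothetical, a plain java.io.OutputStream stands in for FSDataOutputStream, and flush() stands in for hflush().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper, not an HBase or HDFS class. It makes a blocking
// stream look asynchronous by moving the blocking flush onto one thread.
class AsyncOutputWrapper implements AutoCloseable {
    private final OutputStream out; // stand-in for FSDataOutputStream
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // A single daemon thread keeps flushes in submission order.
    private final ExecutorService flusher = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "async-output-flusher");
        t.setDaemon(true);
        return t;
    });

    AsyncOutputWrapper(OutputStream out) {
        this.out = out;
    }

    // Buffer bytes in memory; nothing reaches the underlying stream yet.
    synchronized void write(byte[] data) {
        buffer.write(data, 0, data.length);
    }

    // Hand the buffered bytes to the flusher thread and return at once.
    // The future completes with the number of bytes flushed by this call.
    synchronized CompletableFuture<Long> flush() {
        byte[] pending = buffer.toByteArray();
        buffer.reset();
        CompletableFuture<Long> result = new CompletableFuture<>();
        flusher.execute(() -> {
            try {
                out.write(pending);
                out.flush(); // with FSDataOutputStream this would be hflush()
                result.complete((long) pending.length);
            } catch (IOException | RuntimeException e) {
                result.completeExceptionally(e);
            }
        });
        return result;
    }

    // Wait briefly for queued flushes, then close the underlying stream.
    @Override
    public void close() throws IOException {
        flusher.shutdown();
        try {
            flusher.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        out.close();
    }
}
```

A caller would invoke write(...) and then flush(), and react when the returned future completes instead of blocking on the stream. Every flush still pays for an extra buffer copy and a thread hand-off on top of the blocking call, which is in line with the point made in the thread that this path exists for compatibility rather than speed.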