Hi Huaxiang,

Thanks for the confirmation.

Andor

On Wed, 2022-04-06 at 21:59 +0000, Huaxiang Sun wrote:
> Sorry for replying late. Yeah, at server side, since region move is
> very rare in our production cluster, we did not enable the async wal
> replication. It is only being tested in itbll cluster.
>
> On 2022/03/31 13:35:01 Andor Molnar wrote:
> > One more question Huaxiang which makes me confused:
> >
> > You mentioned that "At server side, we use hfile refresh instead of
> > wal replication." which means that async wal replication for meta
> > is not tested by you.
> >
> > Is that correct?
> >
> > Andor
> >
> > > On 2022. Mar 30., at 19:22, Andor Molnar <an...@apache.org>
> > > wrote:
> > >
> > > Hi Huaxiang,
> > >
> > > Given that you already use this feature in production with no
> > > problems, I created a short patch to remove the “be careful”
> > > warnings from the HBase documentation.
> > >
> > > PTAL if you agree.
> > >
> > > https://github.com/apache/hbase/pull/4301
> > >
> > > Thanks,
> > > Andor
> > >
> > > > On 2022. Mar 29., at 18:52, Huaxiang Sun
> > > > <huaxiang...@apache.org> wrote:
> > > >
> > > > This is great, thanks for the testing results!
> > > >
> > > > Huaxiang
> > > >
> > > > On 2022/03/29 13:29:48 Andor Molnar wrote:
> > > > > Works!
> > > > >
> > > > > I enabled async wal replication with the suggested option and
> > > > > ITBLL ran successfully.
> > > > >
> > > > > generator step:
> > > > > hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList generator 15 1000000 /tmp/hbase-itbll
> > > > >
> > > > > verification step:
> > > > > hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList verify /tmp/hbase-itbll-verify 15
> > > > >
> > > > > Both succeeded. I also confirmed that meta replicas are
> > > > > written and read by the clients, so must be in load balance
> > > > > mode.
> > > > >
> > > > > Thanks for the help!
> > > > >
> > > > > Andor
> > > > >
> > > > > > On 2022. Mar 27., at 6:59, Huaxiang Sun
> > > > > > <huaxiang...@apache.org> wrote:
> > > > > >
> > > > > > It makes sense to turn on async wal replication when
> > > > > > hbase.meta.replicas.use = true. Let me run a couple of rounds
> > > > > > of itbll with hbase.region.replica.replication.catalog.enabled
> > > > > > (latest 2.4 and 2.5.0 candidates) to get more confidence
> > > > > > before proposing to turn on async wal replication for meta.
> > > > > >
> > > > > > Thanks,
> > > > > > Huaxiang
> > > > > >
> > > > > > On 2022/03/26 04:03:15 Andrew Purtell wrote:
> > > > > > > Just to be clear when I say "it seems pointless to have
> > > > > > > meta replicas which do not actually receive updates (by
> > > > > > > default)", what I should have said is 'timely updates',
> > > > > > > because a long delay in updating meta might as well be a
> > > > > > > missed update.
> > > > > > >
> > > > > > > On Fri, Mar 25, 2022 at 9:01 PM Andrew Purtell
> > > > > > > <apurt...@apache.org> wrote:
> > > > > > >
> > > > > > > > > "Async WAL replication for META is added as a new
> > > > > > > > > feature in 2.4.0. It is still under active development.
> > > > > > > > > Use with caution. Set
> > > > > > > > > hbase.region.replica.replication.catalog.enabled to
> > > > > > > > > enable async WAL Replication for META region replicas.
> > > > > > > > > It is off by default."
> > > > > > > >
> > > > > > > > Do we still need this warning?
> > > > > > > >
> > > > > > > > Should hbase.region.replica.replication.catalog.enabled
> > > > > > > > have a default of 'true' (enabled) if
> > > > > > > > hbase.meta.replicas.use = true ? Otherwise, it seems
> > > > > > > > pointless to have meta replicas which do not actually
> > > > > > > > receive updates (by default).
> > > > > > > >
> > > > > > > > On Fri, Mar 25, 2022 at 10:51 AM Huaxiang Sun
> > > > > > > > <huaxiang...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi Andor,
> > > > > > > > >
> > > > > > > > > I get what you are saying. The HFile refreshing is the
> > > > > > > > > old way for replica regions to refresh hfiles
> > > > > > > > > periodically, default is 5 minutes. In this itbll case,
> > > > > > > > > we need to have the wal replication enabled for meta
> > > > > > > > > replica. Please check out
> > > > > > > > > https://hbase.apache.org/book.html#_async_wal_replication_for_meta_table_as_of_hbase_2_4_0
> > > > > > > > > Basically, you need to set
> > > > > > > > > "hbase.region.replica.replication.catalog.enabled" to
> > > > > > > > > true in the configuration and rerun itbll. Otherwise,
> > > > > > > > > all meta changes at the primary meta region won't be
> > > > > > > > > updated at the replica meta regions and it will result
> > > > > > > > > in itbll failures.
> > > > > > > > >
> > > > > > > > > Hope this helps,
> > > > > > > > >
> > > > > > > > > Huaxiang
> > > > > > > > >
> > > > > > > > > On 2022/03/25 13:46:42 Andor Molnar wrote:
> > > > > > > > > > Hi Huaxiang,
> > > > > > > > > >
> > > > > > > > > > We use 2.4.6 for the tests.
> > > > > > > > > >
> > > > > > > > > > I run itbll with the following command:
> > > > > > > > > >
> > > > > > > > > > hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList generator 15 1000000 /tmp/hbase-itbll
> > > > > > > > > >
> > > > > > > > > > for the generator step and essentially jobs have failed.
> > > > > > > > > > We can see the meta requests are spanning out to
> > > > > > > > > > replicas, but writes start failing after this due to
> > > > > > > > > > the stale cache which is not getting updated.
> > > > > > > > > >
> > > > > > > > > > Would you please tell me more about ‘hfile refresh’
> > > > > > > > > > and how to configure it?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Andor
> > > > > > > > > >
> > > > > > > > > > > On 2022. Mar 24., at 17:43, Huaxiang Sun
> > > > > > > > > > > <huaxiang...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Andor,
> > > > > > > > > > >
> > > > > > > > > > > Which 2.4 release do you test in your lab? We use
> > > > > > > > > > > this feature at production cluster with 2.4.5.
> > > > > > > > > > > At server side, we use hfile refresh instead of
> > > > > > > > > > > wal replication. I used to run itbll for each
> > > > > > > > > > > release with this feature enabled. How did you
> > > > > > > > > > > find the errors, did itbll fail?
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Huaxiang
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best regards,
> > > > > > > > Andrew
> > > > > > > >
> > > > > > > > Unrest, ignorance distilled, nihilistic imbeciles -
> > > > > > > > It's what we’ve earned
> > > > > > > > Welcome, apocalypse, what’s taken you so long?
> > > > > > > > Bring us the fitting end that we’ve been counting on
> > > > > > > > - A23, Welcome, Apocalypse
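
For reference, the two settings discussed in this thread can be written as a Hadoop-style configuration fragment. The following is only a minimal sketch: the property names hbase.meta.replicas.use and hbase.region.replica.replication.catalog.enabled come from the messages above, while the helper name render_hbase_site and the idea of generating the fragment from Python are assumptions made purely for illustration. It does not cover which side (client or server) each property belongs on, so check the HBase book before applying it.

```python
from xml.sax.saxutils import escape

# Property names are quoted from the thread. The "true" values mirror the
# setup described there (meta replicas in use, async WAL replication for
# META enabled); treat them as an example, not a recommendation.
META_REPLICA_PROPS = {
    "hbase.meta.replicas.use": "true",
    "hbase.region.replica.replication.catalog.enabled": "true",
}


def render_hbase_site(props):
    """Hypothetical helper: render a dict as a <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the fragment so it can be compared against an hbase-site.xml.
    print(render_hbase_site(META_REPLICA_PROPS))
```

After changing the configuration, the ITBLL generator and verify commands quoted earlier in the thread can be rerun to check the result.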