[jira] [Comment Edited] (HBASE-18872) Backup scaling for multiple table and millions of row
[ https://issues.apache.org/jira/browse/HBASE-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268322#comment-17268322 ]

Mallikarjun edited comment on HBASE-18872 at 1/20/21, 2:13 AM:
---------------------------------------------------------------

Moving this from B testing to Backup/Restore Phase 4. [~vrodionov], move it wherever it is more suitable. [~vishk] FYI

was (Author: rda3mon):
Moving this from B testing to Backup/Restore Phase 4. [~vrodionov], move it wherever it is more suitable. [~vishk] FYI

> Backup scaling for multiple table and millions of row
> -----------------------------------------------------
>
>                 Key: HBASE-18872
>                 URL: https://issues.apache.org/jira/browse/HBASE-18872
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Vishal Khandelwal
>            Assignee: Vladimir Rodionov
>            Priority: Major
>
> I did a simple experiment of loading ~200 million rows into table 1 and nothing into table 2. The test was done on a local cluster; approx. 3-4 containers were running in parallel. The focus of the test was not on how much time the backup takes, but on the time spent on the table where no data had been changed.
> *Table without data:*
> Elapsed: 44mins, 52sec
> Average Map Time: 3sec
> Average Shuffle Time: 2mins, 35sec
> Average Merge Time: 0sec
> Average Reduce Time: 0sec
> Map: 2052
> Reduce: 1
> *Table with data:*
> Elapsed: 1hrs, 44mins, 10sec
> Average Map Time: 4sec
> Average Shuffle Time: 37sec
> Average Merge Time: 3sec
> Average Reduce Time: 47sec
> Map: 2052
> Reduce: 64
> All the above numbers are from a single-node cluster, so not many mappers ran in parallel. But let's extrapolate to a 20-node cluster with ~100 tables and approx. 2000 WALs of data to be backed up. Say each of the 20 nodes can process 3 containers, i.e. 60 WALs in parallel, and assume 3 sec are spent on each WAL: 2000 WALs * 3 sec = 6000 sec per table, / 60 parallel containers --> 100 sec per table --> ~10000 sec for all 100 tables, i.e. ~166 mins --> ~2.7 hrs just for filtering. This does not seem to scale. (These are just rough numbers from a basic test.) All parsing is O(m * n), where m is the number of WALs and n is the number of tables.
> The main intent of this test is to show that even the backup of a table with very little churn can take a good amount of time just for filtering the data. As the number of tables or the data size increases, this does not seem scalable.
> Even in our current cluster I can see numbers easily close to 100 tables, 200 million rows, 200-300 GB.
> I would suggest that filtering should parse the WALs once and segregate the edits into multiple per-table WALs, then build HFiles from the per-table WALs (just a rough idea), as in the sketch below.
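The single-pass segregation proposed above could look roughly like the following sketch. This is a minimal illustration, assuming the HBase 2.x WAL reader API (WALFactory.createReader / WAL.Entry); the class name SinglePassWalSplitter and the in-memory buffering are hypothetical, and a real implementation would stream each per-table group to an HFile or WAL writer (much as WALPlayer does) instead of holding cells in memory.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.wal.WAL;
import org.apache.hadoop.hbase.wal.WALFactory;

// Hypothetical sketch of the "parse WALs once" proposal: one scan over
// each WAL file, with edits grouped by table as they are read.
public class SinglePassWalSplitter {

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);

    // Edits grouped by table, filled in ONE pass over every WAL file,
    // so the scan cost is O(m) instead of O(m * n).
    Map<TableName, List<Cell>> editsByTable = new HashMap<>();

    for (String wal : args) { // each argument is one WAL file path
      try (WAL.Reader reader = WALFactory.createReader(fs, new Path(wal), conf)) {
        WAL.Entry entry;
        while ((entry = reader.next()) != null) {
          TableName table = entry.getKey().getTableName();
          editsByTable
              .computeIfAbsent(table, t -> new ArrayList<>())
              .addAll(entry.getEdit().getCells());
        }
      }
    }

    // A real implementation would flush each group to a per-table WAL or
    // HFile here; printing counts keeps the sketch self-contained.
    editsByTable.forEach(
        (table, cells) -> System.out.println(table + " -> " + cells.size() + " cells"));
  }
}
{code}

With this shape, each WAL is read once regardless of how many tables exist, so the filtering cost becomes O(m) WAL scans plus a per-table write, rather than re-reading all m WALs for each of the n tables.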
[jira] [Comment Edited] (HBASE-18872) Backup scaling for multiple table and millions of row
[ https://issues.apache.org/jira/browse/HBASE-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268313#comment-17268313 ]

Mallikarjun edited comment on HBASE-18872 at 1/20/21, 1:58 AM:
---------------------------------------------------------------

[~vishk] Did your experiment use incremental backup or full backup? Based on my experiments, I might have totally different numbers.

Full backup does a snapshot copy --> this should have no dependence on WAL files.
Incremental backup generates HFiles from the WALs and does a DistCp --> this should be relatively small in size and need not scale to the extent a full backup does.

was (Author: rda3mon):
[~vishk] Did your experiment use incremental backup or full backup? Based on my experiments, I might have totally different numbers.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
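For reference, the two modes being compared can be driven from the backup CLI. A minimal sketch, assuming the `hbase backup create` syntax from the HBase reference guide; the backup root path and table names are made up, and flags can vary across versions:

{code}
# Full backup: snapshot-based copy of the listed tables; does not depend on WALs
hbase backup create full hdfs://backup-nn:8020/hbase-backups -t table1,table2

# Incremental backup: converts WAL entries accumulated since the previous
# backup image in the same backup root into HFiles and ships them with DistCp
hbase backup create incremental hdfs://backup-nn:8020/hbase-backups -t table1,table2
{code}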