Rushabh Shah created PHOENIX-5774:
-------------------------------------
Summary: Phoenix Mapreduce job over hbase Snapshots is extremely
inefficient.
Key: PHOENIX-5774
URL: https://issues.apache.org/jira/browse/PHOENIX-5774
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.13.1
Reporter: Rushabh Shah
Internally we have tenant estimation framework which calculates the number of
rows each tenant occupy in the cluster. Basically what the framework does is it
launch MapReduce(MR) job per table and run the following query : "Select
tenant_id from <table-name>" and we do count over this tenant_id in reducer
phase.
Earlier we use to run this query against live table but we found meta table was
getting hammered over the time this job was running so we thought to run the MR
job on hbase snapshots instead of live table. Take advantage of this feature:
https://issues.apache.org/jira/browse/PHOENIX-3744
When we were querying live table, the MR job for one of the biggest table in
sandbox cluster took around 2.5 hours.
After we started using hbase snapshots, the MR job for the same table took 135
hours. We have maximum concurrent running mapper limit to 15 to avoid
hammering meta table when we were querying live tables. We didn't remove that
restriction after we moved to hbase snapshots.So ideally it shouldn't take 135
hours to complete if we don't have that restriction.
Some statistics about that table:
Size: 578 GB, Num Regions in that table: 161
The average map time took 3 mins 11 seconds when querying live table.
The average map time took 5 hours 33 minutes when querying hbase snapshots.
The issue is we don't consider snapshots while generating splits. So during map
phase, each map task has to go through all regions in snapshots to determine
which region has the start and end key assigned to that task. After determining
all regions, it has to open each region to scan all hfiles in that region. In
one such map task, the start and end key from split was distributed among 289
regions. Reading from each region took an average of 90 seconds, so for 289
regions it took approximately 7 hours.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)