GitHub user eowhadi opened a pull request:
https://github.com/apache/incubator-trafodion/pull/395
[TRAFODION-1900] Optimize MDAM scans with small scanner
When doing MDAM scans, we interleave PROBE scans with the real scans. A probe
always returns only 1 row, after which we close the scanner immediately, so
probes should always use the small scanner. I will make this conditional on
the existing CQD HBASE_SMALL_SCANNER (either SYSTEM or ON). In addition,
blocks retrieved by a probe should always yield at least one successful cache
hit on the next MDAM scan, so forcing block caching ON for MDAM probes is a
good idea. Again, I will make this forcing conditional on HBASE_SMALL_SCANNER
being SYSTEM or ON.
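The probe policy described above can be sketched roughly as follows. This is
an illustration only, not the code in this pull request: SmallScannerCqd,
ProbeScanOptions and forMdamProbe are hypothetical names, and leaving the
caller's defaults untouched when the CQD is OFF is an assumption. In the
HBase 1.x client the two flags would correspond to Scan.setSmall() and
Scan.setCacheBlocks().

```java
/** Hypothetical model of the HBASE_SMALL_SCANNER CQD values named in the description. */
enum SmallScannerCqd { OFF, SYSTEM, ON }

/** Hypothetical holder for the two scan options an MDAM probe would be opened with. */
final class ProbeScanOptions {
    final boolean smallScanner;
    final boolean cacheBlocks;

    ProbeScanOptions(boolean smallScanner, boolean cacheBlocks) {
        this.smallScanner = smallScanner;
        this.cacheBlocks = cacheBlocks;
    }

    /**
     * Chooses the options for an MDAM probe scan.
     * With the CQD set to ON or SYSTEM, the probe is forced to use the small
     * scanner and block caching, because it returns one row and the blocks it
     * reads are expected to be hit again by the next MDAM scan.
     * Assumption: with the CQD OFF, the caller's defaults are left unchanged.
     */
    static ProbeScanOptions forMdamProbe(SmallScannerCqd cqd,
                                         boolean defaultSmallScanner,
                                         boolean defaultCacheBlocks) {
        if (cqd == SmallScannerCqd.ON || cqd == SmallScannerCqd.SYSTEM) {
            return new ProbeScanOptions(true, true);
        }
        return new ProbeScanOptions(defaultSmallScanner, defaultCacheBlocks);
    }
}
```

For example, forMdamProbe(SmallScannerCqd.SYSTEM, false, false) would yield
both flags set, while the same call with SmallScannerCqd.OFF would return the
defaults as passed in.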
Then, for the real scan part of MDAM, I will use the following heuristic: if
the previous scan fit in one HBase block, then it is likely that the next one
will also fit in one HBase block, so enable the small scanner for the next
scan. Again, all of this applies only if the CQD above is ON or SYSTEM.
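A minimal sketch of this heuristic, under stated assumptions, might look like
the following. The class and method names are hypothetical, "fit in one HBase
block" is approximated here by comparing the bytes returned by the previous
scan to a configured block size, and the first real scan is assumed to start
with the small scanner off. None of these details are confirmed by the pull
request text.

```java
/** Hypothetical tracker for the "previous scan fit in one block" heuristic. */
final class MdamSmallScannerHeuristic {
    /** True when HBASE_SMALL_SCANNER is ON or SYSTEM. */
    private final boolean cqdEnabled;
    /** Assumed block size in bytes (HBase's default HFile block size is 64 KB). */
    private final long blockSizeBytes;
    /** Whether the previous real scan fit in one block; assumed false before any scan. */
    private boolean previousScanFitInOneBlock = false;

    MdamSmallScannerHeuristic(boolean cqdEnabled, long blockSizeBytes) {
        this.cqdEnabled = cqdEnabled;
        this.blockSizeBytes = blockSizeBytes;
    }

    /** Decision for the next real MDAM scan. */
    boolean useSmallScannerForNextScan() {
        return cqdEnabled && previousScanFitInOneBlock;
    }

    /** Call after each real MDAM scan with the number of bytes it returned. */
    void recordScanResult(long bytesReturned) {
        previousScanFitInOneBlock = bytesReturned <= blockSizeBytes;
    }
}
```

With the CQD enabled and a 64 KB block size, recording a 1,000-byte scan makes
the next scan use the small scanner, and recording a 100,000-byte scan turns
it back off for the scan after that.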
Also includes a fix for a case where SMALL_SCANNER would be turned on for an
MDAM scan because the compiler does not populate the expected number of rows
returned for MDAM.
Using the small scanner on MDAM where it makes sense showed a 1.39X speed
improvement...
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/eowhadi/incubator-trafodion mdamSmallScanner
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/395.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #395
----
commit 96bcc40b98694609b1bf20ce465d128852c34a48
Author: Eric Owhadi <[email protected]>
Date: 2016-03-18T16:52:51Z
[TRAFODION-1900]
When doing MDAM scans, we interleave PROBE scans with the real scans. A probe
always returns only 1 row, after which we close the scanner immediately, so
probes should always use the small scanner. I will make this conditional on
the existing CQD HBASE_SMALL_SCANNER (either SYSTEM or ON). In addition,
blocks retrieved by a probe should always yield at least one successful cache
hit on the next MDAM scan, so forcing block caching ON for MDAM probes is a
good idea. Again, I will make this forcing conditional on HBASE_SMALL_SCANNER
being SYSTEM or ON.
Then, for the real scan part of MDAM, I will use the following heuristic: if
the previous scan fit in one HBase block, then it is likely that the next one
will also fit in one HBase block, so enable the small scanner for the next
scan. Again, all of this applies only if the CQD above is ON or SYSTEM.
Also includes a fix for a case where SMALL_SCANNER would be turned on for an
MDAM scan because the compiler does not populate the expected number of rows
returned for MDAM.
Using the small scanner on MDAM where it makes sense showed a 1.39X speed
improvement...
----