GitHub user jcmcote opened a pull request:
https://github.com/apache/drill/pull/458
DRILL-4573: Zero copy LIKE, REGEXP_MATCHES, SUBSTR
All the functions that use java.util.regex.Matcher currently create a Java
String object to pass into matcher.reset().
This makes an unnecessary copy of the bytes and allocates a String object
for every value.
Matcher accepts any CharSequence, so instead of making a copy we can write
an adapter that exposes Drill's buffer (DrillBuf) through the CharSequence
interface.
Gains of 25% in execution speed are possible when matching VARCHAR values of
36 chars. The gain will be proportional to the size of the VARCHAR.
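The adapter idea can be sketched roughly as follows. This is not the code
from the patch: it uses java.nio.ByteBuffer as a stand-in for the Drill
buffer, the class name AsciiByteBufferCharSequence is hypothetical, and it
assumes every character is a single byte (ASCII/Latin-1). Multi-byte UTF-8
values would need a separate path.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical adapter: exposes the byte range [start, end) of a buffer as a
 * CharSequence without copying. Assumes one byte per character.
 */
final class AsciiByteBufferCharSequence implements CharSequence {
    private final ByteBuffer buffer; // stand-in for the Drill buffer
    private final int start;         // inclusive byte offset
    private final int end;           // exclusive byte offset

    AsciiByteBufferCharSequence(ByteBuffer buffer, int start, int end) {
        this.buffer = buffer;
        this.start = start;
        this.end = end;
    }

    @Override
    public int length() {
        return end - start;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        // Absolute get: reads one byte in place, no intermediate String.
        return (char) (buffer.get(start + index) & 0xFF);
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length() || from > to) {
            throw new IndexOutOfBoundsException("from: " + from + ", to: " + to);
        }
        // Another view over the same bytes, still no copy.
        return new AsciiByteBufferCharSequence(buffer, start + from, start + to);
    }

    @Override
    public String toString() {
        // This one does copy; the matcher only needs it for group() and similar.
        byte[] bytes = new byte[length()];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = buffer.get(start + i);
        }
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}

final class ZeroCopyMatchDemo {
    /** Reuses one Matcher and points it at a view of the bytes. */
    static boolean matches(Matcher matcher, ByteBuffer buffer, int start, int end) {
        matcher.reset(new AsciiByteBufferCharSequence(buffer, start, end));
        return matcher.matches();
    }

    public static void main(String[] args) {
        byte[] data = "550e8400-e29b-41d4-a716-446655440000"
                .getBytes(StandardCharsets.US_ASCII);
        Matcher matcher = Pattern.compile("[0-9a-f]{8}-.*").matcher("");
        System.out.println(matches(matcher, ByteBuffer.wrap(data), 0, data.length));
    }
}
```

The Matcher is created once and reset per value, so the only per-value
allocation in this sketch is the small adapter object. That too could be
avoided by making the adapter mutable and re-pointing it at each value.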
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jcmcote/drill DRILL-4573
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/458.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #458
----
commit 71b35ecf5895fc8fbae1bf862cbb982787712ee2
Author: jean-claude cote <[email protected]>
Date: 2016-04-02T03:37:00Z
DRILL-4573: Zero copy LIKE, REGEXP_MATCHES, SUBSTR
----