[
https://issues.apache.org/jira/browse/HBASE-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell updated HBASE-11907:
-----------------------------------
Attachment: HBASE-11907.patch
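Updated patch that addresses a potential source of confusion: joni's
Option.MULTILINE is equivalent to Pattern.DOTALL, not Pattern.MULTILINE. The
patch also maps Pattern.MULTILINE explicitly. Does not change any reviewed
functionality. Here is the delta: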
{code}
@@ -320,8 +320,8 @@ public class RegexStringComparator extends ByteArrayComparable {
* This engine operates on byte arrays directly so is expected to be more GC
* friendly, and reportedly is twice as fast as Java's Pattern engine.
* <p>
- * NOTE: Only the {@link Pattern} flags CASE_INSENSITIVE and DOTALL are
- * supported.
+ * NOTE: Only the {@link Pattern} flags CASE_INSENSITIVE, DOTALL, and
+ * MULTILINE are supported.
*/
static class JoniRegexEngine implements Engine {
private Encoding encoding = UTF8Encoding.INSTANCE;
@@ -379,8 +379,15 @@ public class RegexStringComparator extends ByteArrayComparable {
newFlags |= Option.IGNORECASE;
}
if ((flags & Pattern.DOTALL) != 0) {
+ // This does NOT mean Pattern.MULTILINE
newFlags |= Option.MULTILINE;
}
+ if ((flags & Pattern.MULTILINE) != 0) {
+ // This is what Java 8's Nashorn engine does when using joni and
+ // translating Pattern's MULTILINE flag
+ newFlags &= ~Option.SINGLELINE;
+ newFlags |= Option.NEGATE_SINGLELINE;
+ }
return newFlags;
}
@@ -389,9 +396,14 @@ public class RegexStringComparator extends ByteArrayComparable {
if ((flags & Option.IGNORECASE) != 0) {
newFlags |= Pattern.CASE_INSENSITIVE;
}
+ // This does NOT mean Pattern.MULTILINE, this is equivalent to Pattern.DOTALL
if ((flags & Option.MULTILINE) != 0) {
newFlags |= Pattern.DOTALL;
}
+ // This means Pattern.MULTILINE. Nice
+ if ((flags & Option.NEGATE_SINGLELINE) != 0) {
+ newFlags |= Pattern.MULTILINE;
+ }
return newFlags;
}
{code}
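For anyone following along, the distinction the new comments draw can be checked against java.util.regex alone. The following is a minimal sketch, not part of the patch; the class name FlagSemanticsDemo and its find helper are made up for illustration. It shows that Pattern.DOTALL changes what '.' matches, while Pattern.MULTILINE changes where '^' and '$' match, and that neither flag implies the other.

```java
import java.util.regex.Pattern;

// Illustrative only: FlagSemanticsDemo and find() are hypothetical names,
// not part of RegexStringComparator or the patch.
class FlagSemanticsDemo {
  // Returns true if the regex is found anywhere in the input under the given flags.
  static boolean find(String regex, int flags, String input) {
    return Pattern.compile(regex, flags).matcher(input).find();
  }

  public static void main(String[] args) {
    String twoLines = "foo\nbar";

    // DOTALL: '.' also matches line terminators.
    System.out.println(find("foo.bar", 0, twoLines));                 // false
    System.out.println(find("foo.bar", Pattern.DOTALL, twoLines));    // true

    // MULTILINE: '^' and '$' also match at line boundaries inside the input.
    System.out.println(find("^bar$", 0, twoLines));                   // false
    System.out.println(find("^bar$", Pattern.MULTILINE, twoLines));   // true

    // The two flags are independent of each other.
    System.out.println(find("foo.bar", Pattern.MULTILINE, twoLines)); // false
    System.out.println(find("^bar$", Pattern.DOTALL, twoLines));      // false
  }
}
```

This is why a flag named MULTILINE on the joni side cannot simply be passed through as Pattern.MULTILINE: per the comments in the delta, it carries the DOTALL meaning.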
Going to commit later tonight.
> Use the joni byte[] regex engine in place of j.u.regex in
> RegexStringComparator
> -------------------------------------------------------------------------------
>
> Key: HBASE-11907
> URL: https://issues.apache.org/jira/browse/HBASE-11907
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Minor
> Fix For: 2.0.0, 0.98.7, 0.99.1
>
> Attachments: HBASE-11907.patch, HBASE-11907.patch, HBASE-11907.patch,
> HBASE-11907.patch
>
>
> The joni regex engine (https://github.com/jruby/joni), a Java port of the
> Oniguruma regexp library done by the JRuby project, is:
> - MIT licensed
> - Designed to work with byte[] arguments instead of String
> - Capable of handling UTF8 encoding
> - Regex syntax compatible
> - Interruptible
> - *About twice as fast as j.u.regex*
> - Dependent on JRuby's jcodings library, which is also MIT licensed
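To illustrate the byte[] point in the description above: a java.util.regex based matcher has to decode the bytes into a String before it can match, which allocates a new object per comparison. The sketch below shows that decode-then-match step under stated assumptions; the class name StringDecodeMatchSketch, the matches helper, and the UTF-8 charset choice are hypothetical and are not taken from RegexStringComparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical helper showing the decode step that a
// String-based engine needs and a byte[]-based engine can avoid.
class StringDecodeMatchSketch {
  // Decodes value[offset, offset + length) as UTF-8 (an assumption made for
  // this sketch), then searches the resulting String for the pattern.
  static boolean matches(Pattern pattern, byte[] value, int offset, int length) {
    // A fresh String is allocated on every call; this is the per-comparison
    // garbage that matching directly on byte[] is expected to avoid.
    String decoded = new String(value, offset, length, StandardCharsets.UTF_8);
    return pattern.matcher(decoded).find();
  }

  public static void main(String[] args) {
    byte[] cell = "row-42:caf\u00e9".getBytes(StandardCharsets.UTF_8);
    Pattern p = Pattern.compile("caf.$");
    System.out.println(matches(p, cell, 0, cell.length)); // true
    System.out.println(matches(p, cell, 0, 6));           // false, slice is "row-42"
  }
}
```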