[ 
https://issues.apache.org/jira/browse/HBASE-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11907:
-----------------------------------
    Attachment: HBASE-11907.patch

Updated patch that addresses a potential source of confusion: joni's 
Option.MULTILINE corresponds to Pattern.DOTALL, not Pattern.MULTILINE. It 
does not change any reviewed functionality. Here is the delta:
{code}
@@ -320,8 +320,8 @@ public class RegexStringComparator extends ByteArrayComparable {
    * This engine operates on byte arrays directly so is expected to be more GC
    * friendly, and reportedly is twice as fast as Java's Pattern engine.
    * <p>
-   * NOTE: Only the {@link Pattern} flags CASE_INSENSITIVE and DOTALL are
-   * supported.
+   * NOTE: Only the {@link Pattern} flags CASE_INSENSITIVE, DOTALL, and
+   * MULTILINE are supported.
    */
   static class JoniRegexEngine implements Engine {
     private Encoding encoding = UTF8Encoding.INSTANCE;
@@ -379,8 +379,15 @@ public class RegexStringComparator extends ByteArrayComparable {
         newFlags |= Option.IGNORECASE;
       }
       if ((flags & Pattern.DOTALL) != 0) {
+        // This does NOT mean Pattern.MULTILINE
         newFlags |= Option.MULTILINE;
       }
+      if ((flags & Pattern.MULTILINE) != 0) {
+        // This is what Java 8's Nashorn engine does when using joni and
+        // translating Pattern's MULTILINE flag
+        newFlags &= ~Option.SINGLELINE;
+        newFlags |= Option.NEGATE_SINGLELINE;
+      }
       return newFlags;
     }
 
@@ -389,9 +396,14 @@ public class RegexStringComparator extends ByteArrayComparable {
       if ((flags & Option.IGNORECASE) != 0) {
         newFlags |= Pattern.CASE_INSENSITIVE;
       }
+      // This does NOT mean Pattern.MULTILINE, this is equivalent to Pattern.DOTALL
       if ((flags & Option.MULTILINE) != 0) {
         newFlags |= Pattern.DOTALL;
       }
+      // This means Pattern.MULTILINE. Nice
+      if ((flags & Option.NEGATE_SINGLELINE) != 0) {
+        newFlags |= Pattern.MULTILINE;
+      }
       return newFlags;
     }
{code}
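
For reference, the distinction the new comments draw can be seen with j.u.regex alone. A minimal sketch using only the standard library; the class name PatternFlagDemo and the sample input are made up for illustration and are not part of the patch. It only shows what Pattern.DOTALL and Pattern.MULTILINE mean on the Java side, which is what the translated joni options are expected to reproduce:

```java
import java.util.regex.Pattern;

public class PatternFlagDemo {
    // Hypothetical two-line value, standing in for a cell value.
    static final String INPUT = "row1\nrow2";

    /** Returns true if the regex is found anywhere in the input under the given flags. */
    static boolean find(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // DOTALL: '.' also matches the line terminator between the two rows.
        System.out.println(find("row1.row2", 0, INPUT));                  // false
        System.out.println(find("row1.row2", Pattern.DOTALL, INPUT));     // true

        // MULTILINE: '^' and '$' also match at line boundaries inside the input.
        System.out.println(find("^row2$", 0, INPUT));                     // false
        System.out.println(find("^row2$", Pattern.MULTILINE, INPUT));     // true

        // The two flags are independent; neither implies the other.
        System.out.println(find("row1.row2", Pattern.MULTILINE, INPUT));  // false
        System.out.println(find("^row2$", Pattern.DOTALL, INPUT));        // false
    }
}
```

The last two lines are the case the patch comments warn about: treating the two flags as interchangeable changes which inputs match.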

Going to commit later tonight.

> Use the joni byte[] regex engine in place of j.u.regex in 
> RegexStringComparator
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-11907
>                 URL: https://issues.apache.org/jira/browse/HBASE-11907
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 2.0.0, 0.98.7, 0.99.1
>
>         Attachments: HBASE-11907.patch, HBASE-11907.patch, HBASE-11907.patch, 
> HBASE-11907.patch
>
>
> The joni regex engine (https://github.com/jruby/joni), a Java port of the 
> Oniguruma regexp library done by the JRuby project, is:
> - MIT licensed
> - Designed to work with byte[] arguments instead of String
> - Capable of handling UTF8 encoding
> - Regex syntax compatible
> - Interruptible
> - *About twice as fast as j.u.regex*
> - Dependent on JRuby's jcodings library, also MIT licensed


