mwkang commented on code in PR #4720:
URL: https://github.com/apache/hbase/pull/4720#discussion_r972614431


##########
hbase-client/src/main/java/org/apache/hadoop/hbase/filter/TimeoutCharSequence.java:
##########
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.filter;
+
+import org.apache.hadoop.hbase.DoNotRetryUncheckedIOException;
+import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
+import org.apache.hadoop.util.StringUtils;
+
+/**
+ * It checks whether the timeout has been exceeded whenever the charAt method is called.
+ */
+class TimeoutCharSequence implements CharSequence {
+  private final CharSequence value;
+  private final long startMillis;
+  private final long timeoutMillis;
+
+  /**
+   * Initialize a TimeoutCharSequence.
+   * @param value         the original data
+   * @param startMillis   time the operation started (ms)
+   * @param timeoutMillis the timeout (ms)
+   */
+  TimeoutCharSequence(CharSequence value, long startMillis, long timeoutMillis) {
+    this.value = value;
+    this.startMillis = startMillis;
+    this.timeoutMillis = timeoutMillis;
+  }
+
+  @Override
+  public int length() {
+    return value.length();
+  }
+
+  @Override
+  public char charAt(int index) {
+    final long diff = EnvironmentEdgeManager.currentTime() - startMillis;

Review Comment:
   @Apache9
   Thank you for your review.
   
   My understanding is that ReDoS is caused by backtracking.
   Therefore, I considered adding RE2J as a new engine to prevent ReDoS
   (RE2J does not use backtracking).
   However, RE2J's syntax and results differ slightly from those of the existing engines,
   so I was not confident that adding a RE2J engine would be acceptable.
   I therefore looked for an alternative to adding a RE2J engine.
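   
   To illustrate the backtracking concern, here is a minimal sketch (not code from this PR) that counts how often the Java regex engine reads a character while matching a nested-quantifier pattern against inputs of growing length. The class and method names are hypothetical, and how steeply the count grows depends on the pattern and on the JDK version, so no specific numbers are claimed.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in HBase.
class BacktrackingCountSketch {

  /** Wraps a CharSequence and counts every charAt call made on it. */
  static class CountingCharSequence implements CharSequence {
    private final CharSequence value;
    long reads = 0;

    CountingCharSequence(CharSequence value) {
      this.value = value;
    }

    @Override
    public int length() {
      return value.length();
    }

    @Override
    public char charAt(int index) {
      reads++; // one read by the regex engine
      return value.charAt(index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
      return value.subSequence(start, end);
    }

    @Override
    public String toString() {
      return value.toString();
    }
  }

  /** Returns the number of charAt calls made during one matches() attempt. */
  static long countReads(String regex, String input) {
    CountingCharSequence seq = new CountingCharSequence(input);
    Pattern.compile(regex).matcher(seq).matches();
    return seq.reads;
  }

  /** Builds n copies of 'a' followed by a character that forces the match to fail. */
  static String attackInput(int n) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < n; i++) {
      sb.append('a');
    }
    return sb.append('!').toString();
  }

  public static void main(String[] args) {
    // Nested quantifiers are a classic source of heavy backtracking.
    for (int n = 8; n <= 16; n += 4) {
      System.out.println(n + " -> " + countReads("(a+)+$", attackInput(n)));
    }
  }
}
```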
   
   For the Java regex engine, I found that the engine calls the charAt method of the input CharSequence each time it reads a character.
   The Java regex engine does not provide a timeout feature, but it seemed that one could be implemented (unconventionally) by checking the elapsed time inside charAt.
   I also considered checking the timeout with additional threads.
   However, I felt that running additional threads would be a waste of resources.
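   
   The charAt-based timeout described above can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the code in this PR: the names RegexTimeoutSketch, DeadlineCharSequence, RegexTimeoutException, and matchesWithin are hypothetical, it uses System.currentTimeMillis directly instead of EnvironmentEdgeManager, and it throws its own unchecked exception rather than the DoNotRetryUncheckedIOException that the PR's class imports.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in HBase.
class RegexTimeoutSketch {

  /** Unchecked exception used by this sketch to abort a long-running match. */
  static class RegexTimeoutException extends RuntimeException {
    RegexTimeoutException(String message) {
      super(message);
    }
  }

  /** Wraps a CharSequence and checks the elapsed time on every charAt call. */
  static class DeadlineCharSequence implements CharSequence {
    private final CharSequence value;
    private final long startMillis;
    private final long timeoutMillis;

    DeadlineCharSequence(CharSequence value, long startMillis, long timeoutMillis) {
      this.value = value;
      this.startMillis = startMillis;
      this.timeoutMillis = timeoutMillis;
    }

    @Override
    public int length() {
      return value.length();
    }

    @Override
    public char charAt(int index) {
      // The regex engine reads input through charAt, so the check runs during matching.
      long elapsed = System.currentTimeMillis() - startMillis;
      if (elapsed > timeoutMillis) {
        throw new RegexTimeoutException("regex matching exceeded " + timeoutMillis + " ms");
      }
      return value.charAt(index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
      // Keep the same deadline for any sub-sequence handed out.
      return new DeadlineCharSequence(value.subSequence(start, end), startMillis, timeoutMillis);
    }

    @Override
    public String toString() {
      return value.toString();
    }
  }

  /** Runs matches() on the wrapped input; throws RegexTimeoutException once the timeout has passed. */
  static boolean matchesWithin(Pattern pattern, CharSequence input, long startMillis,
      long timeoutMillis) {
    return pattern.matcher(new DeadlineCharSequence(input, startMillis, timeoutMillis)).matches();
  }

  public static void main(String[] args) {
    Pattern pattern = Pattern.compile("a.c");
    // Generous timeout: the match completes normally.
    System.out.println(matchesWithin(pattern, "abc", System.currentTimeMillis(), 1000L));
    // A start time in the past simulates an operation that has already run too long.
    try {
      matchesWithin(pattern, "abc", System.currentTimeMillis() - 5000L, 10L);
    } catch (RegexTimeoutException e) {
      System.out.println("timed out: " + e.getMessage());
    }
  }
}
```

   One consequence of this design is that the timeout is only noticed when the engine reads a character, and every read pays for one clock call.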
   
   When I monitored our internal cluster, I found no large overhead from the System.currentTimeMillis calls in the pattern matching operation.
   (There may be some overhead, of course.)
   
   By default, TimeoutCharSequence is not used, so most HBase users are unlikely to be affected by the System.currentTimeMillis overhead.
   I wanted to provide a way to address ReDoS.
   
   If there is a better way, I would appreciate your advice.



