Naresh P R created HIVE-26047:
---------------------------------
Summary: Vectorized LIKE UDF should use Re2J regex to address
JDK-8203458
Key: HIVE-26047
URL: https://issues.apache.org/jira/browse/HIVE-26047
Project: Hive
Issue Type: Bug
Reporter: Naresh P R
Assignee: Naresh P R
The pattern below takes a very long time to validate against the regex on Java 8,
with the same stack trace as shown in the Java bug
[JDK-8203458|https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458]
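{code:java}
import java.util.regex.Pattern;

public class ABCD {
  public static void main(String[] args) {
    // LIKE pattern under validation: a long run of 'a' followed by "_b"
    String pattern =
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";
    // Nested quantifiers: (...+...)+ backtracks heavily when the match fails at '_'
    Pattern CHAIN_PATTERN = Pattern.compile("(%?[^%_\\\\]+%?)+");
    // This call takes a very long time to return
    CHAIN_PATTERN.matcher(pattern).matches();
  }
}
{code}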
The same is reproducible with the following SQL:
{code:sql}
create table table1(name string);
insert into table1 (name) values
('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b');
select * from table1 where name like
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa_b";{code}
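
The Java reproducer above hangs because the nested quantifiers in {{(%?[^%_\\]+%?)+}} let java.util.regex try a huge number of ways to split the run of 'a' characters before it gives up at '_'. The sketch below is only an illustration of that cause, not the fix proposed in this issue (which is to switch to Re2J). It assumes that making the inner quantifier possessive ({{++}}) is acceptable for this pattern. The class name LikeChainSketch, the helpers isChain and buildInput, and the POSSESSIVE pattern are hypothetical and do not come from the Hive code base.

```java
import java.util.regex.Pattern;

class LikeChainSketch {
    // Pattern from the report: nested greedy quantifiers, prone to heavy backtracking.
    static final Pattern GREEDY = Pattern.compile("(%?[^%_\\\\]+%?)+");

    // Hypothetical variant: the inner quantifier is possessive (++), so a run of
    // literal characters is consumed once and never split up again on failure.
    static final Pattern POSSESSIVE = Pattern.compile("(%?[^%_\\\\]++%?)+");

    // Returns true if the whole LIKE pattern consists only of literals and '%'.
    static boolean isChain(String likePattern) {
        return POSSESSIVE.matcher(likePattern).matches();
    }

    // Builds the input shape from the report: n copies of 'a' followed by "_b".
    static String buildInput(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.append("_b").toString();
    }

    public static void main(String[] args) {
        // On short inputs both patterns can be compared safely.
        String[] samples = {"abc", "%abc%", "a%b", "a%%b", "a_b", "aaa_b", ""};
        for (String s : samples) {
            boolean greedy = GREEDY.matcher(s).matches();
            boolean possessive = isChain(s);
            System.out.println("'" + s + "' greedy=" + greedy + " possessive=" + possessive);
        }
        // Only the possessive variant is run on the long input from the report.
        System.out.println("long input: " + isChain(buildInput(105)));
    }
}
```

The greedy pattern is deliberately exercised on short strings only. Whether a possessive quantifier or a different regex engine is the better choice for the vectorized LIKE UDF is left to the actual patch.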
--
This message was sent by Atlassian Jira
(v8.20.1#820001)