[ https://issues.apache.org/jira/browse/SPARK-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696561#comment-14696561 ]
Yadong Qi edited comment on SPARK-9213 at 8/14/15 6:34 AM:
-----------------------------------------------------------

[~rxin] Yes, I will do it as below:
```
// Pseudocode: `flag` selects which regex engine backs LIKE.
case class Like(left: Expression, right: Expression) {
  if (flag) {
    JavaLike(left, right)
  } else {
    JoniLike(left, right)
  }
}
case class JavaLike(left: Expression, right: Expression)
case class JoniLike(left: Expression, right: Expression)
```
Right?

> Improve regular expression performance (via joni)
> -------------------------------------------------
>
>                 Key: SPARK-9213
>                 URL: https://issues.apache.org/jira/browse/SPARK-9213
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>            Reporter: Reynold Xin
>
> I'm creating an umbrella ticket to improve regular expression performance for
> string expressions. Right now our use of regular expressions is inefficient
> for two reasons:
> 1. Java regex in general is slow.
> 2. We have to convert everything from UTF8 encoded bytes into Java String,
> and then run regex on it, and then convert it back.
> There are libraries in Java that provide regex support directly on UTF8
> encoded bytes. One prominent example is joni, used in JRuby.
> Note: all regex functions are in
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
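
The `Like` dispatch proposed in the comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not Spark's actual code: `Literal` is a hypothetical stand-in for Catalyst's `Expression`, `useJavaRegex` is a hypothetical name for the `flag` in the comment, the LIKE-to-regex translation is simplified (no escape character handling), and `JoniLike` is a placeholder that delegates to `java.util.regex` so the sketch runs without the joni dependency.

```scala
import java.util.regex.Pattern

// Hypothetical stand-in for Catalyst's Expression: it only carries a string value.
final case class Literal(value: String)

// Common interface, so callers do not care which regex engine is used.
sealed trait LikeExpr {
  def left: Literal
  def right: Literal
  def eval(): Boolean
}

object LikeExpr {
  // Simplified translation of a SQL LIKE pattern into a Java regex:
  // '%' becomes ".*", '_' becomes ".", every other character is quoted.
  // "(?s)" lets '.' match line terminators as well.
  def likeToRegex(pattern: String): String =
    "(?s)" + pattern.toSeq.map {
      case '%' => ".*"
      case '_' => "."
      case c   => Pattern.quote(c.toString)
    }.mkString
}

// LIKE backed by java.util.regex.
final case class JavaLike(left: Literal, right: Literal) extends LikeExpr {
  private val compiled = Pattern.compile(LikeExpr.likeToRegex(right.value))
  def eval(): Boolean = compiled.matcher(left.value).matches()
}

// Placeholder for a joni-backed LIKE. A real version would match directly on
// UTF-8 bytes; this one delegates to JavaLike so the example stays self-contained.
final case class JoniLike(left: Literal, right: Literal) extends LikeExpr {
  private val fallback = JavaLike(left, right)
  def eval(): Boolean = fallback.eval()
}

// Factory that returns the chosen implementation, mirroring the
// if (flag) JavaLike else JoniLike branch in the comment.
object Like {
  def apply(left: Literal, right: Literal, useJavaRegex: Boolean): LikeExpr =
    if (useJavaRegex) JavaLike(left, right) else JoniLike(left, right)
}
```

Putting the branch in a factory method rather than in the body of a `case class` means the selected implementation is what the caller actually receives; in the pseudocode the result of the `if` would be discarded. For example, `Like(Literal("spark"), Literal("sp%"), useJavaRegex = true)` yields a `JavaLike` instance.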