beliefer edited a comment on issue #27507: [SPARK-24884][SQL] Support regexp 
function regexp_extract_all
URL: https://github.com/apache/spark/pull/27507#issuecomment-585518801
 
 
   > I have a high-level question. Is there a significant advantage to 
generating Java code?
   > 
   > One advantage is that generated code can store the result of each 
`Pattern.compile()` in its own global variable for caching, whereas the 
non-generated code shares one variable for the cache.
   > On the other hand, the size of the result is not small. Which side of the 
trade-off do we choose: space or performance?
   
   LIKE and RLIKE cache the result of `Pattern.compile()`.
   RegExpReplace and RegExpExtract take a different approach: 
https://github.com/apache/spark/blob/5e3c092dc055ca0f1a2f523efa5f305555b991e6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L438
   If the pattern string is a constant, the two approaches achieve the same 
goal.
   If the pattern string is a variable, the performance cost seems 
unavoidable.
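
To make the constant-versus-variable point concrete, here is a minimal sketch of a "remember the last regex" cache. It is only an illustration of the general idea, not the Spark implementation: the class `LastRegexCache`, its method names, and the `compileCount` counter are all hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not Spark code): keeps only the most recently
// compiled pattern and recompiles when the regex string changes.
class LastRegexCache {
    private String lastRegex;   // regex string seen on the previous call
    private Pattern pattern;    // compiled form of lastRegex
    int compileCount = 0;       // illustration only: number of compilations

    private Pattern patternFor(String regex) {
        // Recompile only when the regex differs from the previous one.
        if (!regex.equals(lastRegex)) {
            lastRegex = regex;
            pattern = Pattern.compile(regex);
            compileCount++;
        }
        return pattern;
    }

    // Returns every full match (group 0) of regex in input, in order.
    List<String> extractAll(String input, String regex) {
        Matcher m = patternFor(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }
}
```

Under these assumptions, a constant regex is compiled once no matter how many rows are processed, while a regex that changes from row to row is recompiled on every change, which is the cost described above as hard to avoid.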
