Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/20795#discussion_r176292180
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -1192,11 +1195,23 @@ class Analyzer(
* @see https://issues.apache.org/jira/browse/SPARK-19737
*/
object LookupFunctions extends Rule[LogicalPlan] {
-    override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
-      case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
-        withPosition(f) {
-          throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
-        }
+    override def apply(plan: LogicalPlan): LogicalPlan = {
+      val catalogFunctionNameSet = new mutable.HashSet[FunctionIdentifier]()
+      plan.transformAllExpressions {
+        case f: UnresolvedFunction if catalogFunctionNameSet.contains(f.name) => f
+        case f: UnresolvedFunction if catalog.functionExists(f.name) =>
+          catalogFunctionNameSet.add(normalizeFuncName(f.name))
+          f
+        case f: UnresolvedFunction =>
+          withPosition(f) {
+            throw new NoSuchFunctionException(f.name.database.getOrElse("default"),
+              f.name.funcName)
+          }
+      }
+    }
+
+    private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
+      FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
--- End diff --
For built-in functions, it may not be a big deal if we don't find them in this
cache, since querying built-in functions should be very fast. As I recall, the
main issue in this ticket is external function lookup, which puts more load on
the connection to the metastore.
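
To illustrate the point, here is a minimal sketch, not the actual Analyzer code. `FuncId`, `FakeCatalog`, and `LookupSketch` are hypothetical stand-ins for `FunctionIdentifier` and the session catalog, and the counter models round trips to the metastore. It assumes built-ins are resolved in memory and that names are compared case-insensitively, as `normalizeFuncName` in the diff suggests.

```scala
import java.util.Locale
import scala.collection.mutable

// Hypothetical stand-in for Spark's FunctionIdentifier.
final case class FuncId(funcName: String, database: Option[String] = None)

// Hypothetical catalog: built-ins are answered from memory, every other
// lookup is counted as one round trip to the metastore.
final class FakeCatalog(builtins: Set[String], external: Set[FuncId]) {
  var metastoreCalls = 0

  def functionExists(id: FuncId): Boolean = {
    val isBuiltin =
      id.database.isEmpty && builtins.contains(id.funcName.toLowerCase(Locale.ROOT))
    if (isBuiltin) {
      true
    } else {
      metastoreCalls += 1
      external.contains(id)
    }
  }
}

object LookupSketch {
  // Lower-case the function name so that differently cased spellings share a key.
  def normalize(id: FuncId): FuncId =
    id.copy(funcName = id.funcName.toLowerCase(Locale.ROOT))

  // Returns the identifiers that could not be resolved. Names that were
  // already found once are answered from the local set, not the catalog.
  def missing(names: Seq[FuncId], catalog: FakeCatalog): Seq[FuncId] = {
    val seen = mutable.HashSet[FuncId]()
    names.filterNot { n =>
      val key = normalize(n)
      seen.contains(key) || (catalog.functionExists(key) && seen.add(key))
    }
  }
}
```

In this sketch both the check and the insert use the normalized key, so a repeated external function costs a single metastore call, while built-ins never reach the metastore whether or not they are cached.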
---