kevin yu commented on SPARK-23486:

Hello [~lian cheng]: I think the easiest fix is to put a hash map in front of 
the LookupFunctions rule. The first time a function is found in the external 
catalog, its name is added to the map; on later lookups the rule checks the 
map first and skips the metastore access on a hit. Does this approach look OK 
to you? If so, I can provide a PR for review. Thanks.
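
The hash map idea above can be sketched roughly as follows. This is a 
language-neutral illustration in Python, not Spark code: check_functions, 
exists_in_catalog, and the LookupError are hypothetical names chosen for the 
example, and the callback stands in for whatever call reaches the metastore.

```python
from typing import Callable, Iterable, Set


def check_functions(
    names: Iterable[str],
    exists_in_catalog: Callable[[str], bool],  # hypothetical stand-in for the metastore lookup
) -> None:
    """Verify every invoked function exists, asking the catalog once per distinct name."""
    known: Set[str] = set()  # names already confirmed to exist during this pass
    for name in names:
        if name in known:
            continue  # cache hit: no catalog access needed
        if not exists_in_catalog(name):
            # Assumption for the sketch: an unknown function aborts the check.
            raise LookupError(f"undefined function: {name}")
        known.add(name)  # only functions that exist are cached
```

With this shape, a query that invokes the same function many times triggers one 
catalog call per distinct name instead of one per invocation.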

Another approach is to cache the external catalog functions in the shared 
state so that many queries can reuse them, but invalidating that cache would 
be more involved.
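
To show why a shared cache needs invalidation, here is a minimal sketch, again in 
Python rather than Spark code. SharedFunctionCache, exists, and invalidate are 
hypothetical names; the assumption is that whatever code changes the catalog 
(for example, dropping a function) would have to call invalidate.

```python
import threading
from typing import Callable, Set


class SharedFunctionCache:
    """Cache of known function names shared by many queries (illustrative only)."""

    def __init__(self, exists_in_catalog: Callable[[str], bool]) -> None:
        self._exists_in_catalog = exists_in_catalog  # hypothetical metastore lookup
        self._known: Set[str] = set()
        self._lock = threading.Lock()

    def exists(self, name: str) -> bool:
        # The lock is held across the catalog call so a concurrent invalidate
        # cannot be overwritten by a stale result; this serializes lookups.
        with self._lock:
            if name in self._known:
                return True
            found = self._exists_in_catalog(name)
            if found:
                self._known.add(name)
            return found

    def invalidate(self, name: str) -> None:
        # Must be called whenever the function is changed or dropped in the catalog.
        with self._lock:
            self._known.discard(name)
```

A per-query map avoids this bookkeeping because it is discarded after each 
analysis, which is the trade-off between the two approaches.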

> LookupFunctions should not check the same function name more than once
> ----------------------------------------------------------------------
>                 Key: SPARK-23486
>                 URL: https://issues.apache.org/jira/browse/SPARK-23486
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1, 2.3.0
>            Reporter: Cheng Lian
>            Priority: Major
>              Labels: starter
> For a query invoking the same function multiple times, the current 
> {{LookupFunctions}} rule performs a check for each invocation. For users 
> using Hive metastore as external catalog, this issues unnecessary metastore 
> accesses and can slow down the analysis phase quite a bit.
