[ 
https://issues.apache.org/jira/browse/FLINK-21738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300058#comment-17300058
 ] 

zoucao commented on FLINK-21738:
--------------------------------

Hi [~qingyue] and [~jark], thanks for your quick replies. I agree with Jark's 
opinion: the cache should live in each Module instead of in ModuleManager. At 
present, the core module and the hive module can each use a `Set<String>` to 
cache function names. If the cache is kept in the module, there is no need to 
worry about the loading order. What do you think, Jane? By the way, iterating 
only until the first matching function is found is another good point.
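
To make the idea concrete, here is a minimal sketch of a module that caches its function names in a `Set<String>`, plus a lookup that stops at the first module that knows the function. It is an illustration only: `CachingModuleSketch`, `nameLoader`, `loadCount` and `resolve` are hypothetical names, definitions are represented as plain strings, and none of this is Flink's actual Module or ModuleManager code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a module; NOT Flink's Module interface.
class CachingModuleSketch {
    private final String label;                      // e.g. "core" or "hive"
    private final Supplier<Set<String>> nameLoader;  // the (assumed expensive) listing step
    private Set<String> cachedNames;                 // filled on first use
    int loadCount = 0;                               // exposed only for the demo

    CachingModuleSketch(String label, Supplier<Set<String>> nameLoader) {
        this.label = label;
        this.nameLoader = nameLoader;
    }

    // Loads the names once; later calls return the cached set.
    Set<String> listFunctions() {
        if (cachedNames == null) {
            loadCount++;
            cachedNames = Collections.unmodifiableSet(new HashSet<>(nameLoader.get()));
        }
        return cachedNames;
    }

    // Assumption: a definition is just a string here, to keep the sketch self-contained.
    Optional<String> getFunctionDefinition(String name) {
        return listFunctions().contains(name)
                ? Optional.of(label + "." + name)
                : Optional.empty();
    }

    // Walks the modules in loading order and stops at the first match.
    static Optional<String> resolve(List<CachingModuleSketch> modulesInOrder, String name) {
        for (CachingModuleSketch module : modulesInOrder) {
            Optional<String> definition = module.getFunctionDefinition(name);
            if (definition.isPresent()) {
                return definition;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        CachingModuleSketch core = new CachingModuleSketch(
                "core", () -> new HashSet<>(Arrays.asList("upper")));
        CachingModuleSketch hive = new CachingModuleSketch(
                "hive", () -> new HashSet<>(Arrays.asList("get_json_object", "upper")));
        List<CachingModuleSketch> modules = Arrays.asList(core, hive);

        for (int i = 0; i < 6; i++) {
            resolve(modules, "get_json_object");
        }
        System.out.println(resolve(modules, "upper").get());  // prints "core.upper"
        System.out.println("core loads=" + core.loadCount + ", hive loads=" + hive.loadCount);
        // prints "core loads=1, hive loads=1"
    }
}
```

Because each module owns its own name set, the order in which modules are loaded only affects which module wins in `resolve`, not the correctness of the cache.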

Finally, I have another question about function loading that I hope you can 
help me with. Suppose we use a function many times in one DML statement:
{quote}select
 cast(get_json_object(message, '$.did') as VARCHAR) as server_app_did,
 get_json_object(message, '$.model') as prod_model,
 get_json_object(message, '$.current') as current_version,
 get_json_object(message, '$.ret.code') as rpc_code,
 max(get_json_object(message, '$.time')) as upgrade_time,
 get_json_object(message, '$.version') as to_version
from  test_table;
{quote}
In this DML statement we use the Hive function `get_json_object` 6 times, so 
the method FunctionCatalog#lookupFunction will be called 6 times, right? Since 
there is only one function called get_json_object, the first call already 
returns the result (FunctionLookup.Result), so why do we make the next 5 
calls? Or is there an optimization or a concern that I have missed?
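
For illustration only: if the repeated lookups did turn out to be redundant, one way to avoid them would be to memoize results by function name. The sketch below does not describe what FunctionCatalog actually does; `MemoizedLookupSketch`, `delegate` and `delegateCalls` are hypothetical names, and the result is modelled as a plain string rather than FunctionLookup.Result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical wrapper that remembers lookup results by function name.
class MemoizedLookupSketch {
    private final Function<String, Optional<String>> delegate;  // the real (assumed costly) lookup
    private final Map<String, Optional<String>> results = new HashMap<>();
    int delegateCalls = 0;                                       // exposed only for the demo

    MemoizedLookupSketch(Function<String, Optional<String>> delegate) {
        this.delegate = delegate;
    }

    // Calls the delegate only the first time a name is seen.
    Optional<String> lookupFunction(String name) {
        Optional<String> cached = results.get(name);
        if (cached == null) {
            delegateCalls++;
            cached = delegate.apply(name);
            results.put(name, cached);  // negative results (empty) are remembered too
        }
        return cached;
    }

    public static void main(String[] args) {
        MemoizedLookupSketch lookup = new MemoizedLookupSketch(
                name -> name.equals("get_json_object")
                        ? Optional.of("hive.get_json_object")
                        : Optional.<String>empty());
        for (int i = 0; i < 6; i++) {
            lookup.lookupFunction("get_json_object");
        }
        System.out.println("delegate calls: " + lookup.delegateCalls);  // prints "delegate calls: 1"
    }
}
```

A cache like this would have to be invalidated whenever functions are registered or dropped, or modules are loaded or unloaded, which may be one of the concerns behind looking the function up on every call.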

> reduce unnecessary method calls  in ModuleManager
> -------------------------------------------------
>
>                 Key: FLINK-21738
>                 URL: https://issues.apache.org/jira/browse/FLINK-21738
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: zoucao
>            Priority: Major
>
> In Flink SQL, if we use many functions (Hive functions or Flink built-in 
> functions), Flink will call the method
> `getFunctionDefinition` in 
> [ModuleManager|https://github.com/apache/flink/blob/97bfd049951f8d52a2e0aed14265074c4255ead0/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/module/ModuleManager.java#L44]
>  many times to load the functions, and each module's `listFunctions` method 
> will be called every time. I think the same result will be returned for a 
> given module, so maybe a cache should be used here to avoid wasted time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
