[
https://issues.apache.org/jira/browse/FLINK-21738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300058#comment-17300058
]
zoucao commented on FLINK-21738:
--------------------------------
Hi [~qingyue] and [~jark], thanks for your quick replies. I agree with Jark's
opinion: the cache should be used in Module instead of ModuleManager. At present,
the core module and the hive module can use a `Set<String>` to cache function names. If
the cache lives in the module, there is no need to worry about the loading order. What do you
think, Jane? By the way, iterating only until the first matching function is found is another good
point.
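To make the proposal concrete, here is a minimal sketch of a per-module name cache combined with a lookup that stops at the first match. `SimpleModule`, `CachingModule` and `FirstMatchLookup` are hypothetical names used only for illustration; they are simplified stand-ins, not Flink's actual `Module` or `ModuleManager` classes, and the function definition is reduced to a plain `String`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical, simplified stand-in for a module (not Flink's real interface). */
interface SimpleModule {
    Set<String> listFunctions();

    Optional<String> getFunctionDefinition(String name);
}

/** Wraps a module and lists its function names only once. */
final class CachingModule implements SimpleModule {
    private final SimpleModule delegate;
    private Set<String> cachedNames; // filled lazily, names stored lower-cased

    CachingModule(SimpleModule delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized Set<String> listFunctions() {
        if (cachedNames == null) {
            Set<String> names = new HashSet<>();
            for (String n : delegate.listFunctions()) {
                names.add(n.toLowerCase(Locale.ROOT));
            }
            cachedNames = Collections.unmodifiableSet(names);
        }
        return cachedNames;
    }

    @Override
    public Optional<String> getFunctionDefinition(String name) {
        // Consult the cached names first; only hit the delegate for known functions.
        if (!listFunctions().contains(name.toLowerCase(Locale.ROOT))) {
            return Optional.empty();
        }
        return delegate.getFunctionDefinition(name);
    }
}

/** Iterates modules in order and returns as soon as one of them has the function. */
final class FirstMatchLookup {
    private FirstMatchLookup() {}

    static Optional<String> lookup(List<? extends SimpleModule> modules, String name) {
        for (SimpleModule module : modules) {
            Optional<String> definition = module.getFunctionDefinition(name);
            if (definition.isPresent()) {
                return definition;
            }
        }
        return Optional.empty();
    }
}
```

Because each module owns its cache, the order in which modules are loaded does not affect the cached content. The sketch assumes a module's function list does not change after the first listing; a module whose functions can change would need to invalidate `cachedNames`.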
Finally, I have another question about function loading that I hope you can
help me with. Suppose we use a function many times in one DML statement:
{quote}select
cast(get_json_object(message, '$.did') as VARCHAR) as server_app_did,
get_json_object(message, '$.model') as prod_model,
get_json_object(message, '$.current') as current_version,
get_json_object(message, '$.ret.code') as rpc_code,
max(get_json_object(message, '$.time')) as upgrade_time,
get_json_object(message, '$.version') as to_version
from test_table;
{quote}
In this DML, we use the hive function `get_json_object` 6 times, so the method
FunctionCatalog#lookupFunction will be called 6 times, right? Since there is
only one function named get_json_object, the first call already returns
the result (FunctionLookup.Result), so why do we need the other 5 calls? Or are
there optimizations or concerns that I have missed?
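To illustrate the question, here is a rough sketch of what memoizing repeated lookups by name could look like. `MemoizingLookup` and its `resolver` argument are hypothetical and only stand in for the real lookup path; this is not how FunctionCatalog currently behaves, and the result type is reduced to `Optional<String>`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Remembers the result of each lookup so that a repeated name is resolved only once. */
final class MemoizingLookup {
    private final Function<String, Optional<String>> resolver; // hypothetical real lookup
    private final Map<String, Optional<String>> results = new HashMap<>();
    private int resolverCalls = 0;

    MemoizingLookup(Function<String, Optional<String>> resolver) {
        this.resolver = resolver;
    }

    Optional<String> lookup(String name) {
        Optional<String> cached = results.get(name);
        if (cached == null) {
            // First time this name is seen: do the real lookup and remember the outcome.
            resolverCalls++;
            cached = resolver.apply(name);
            results.put(name, cached);
        }
        return cached;
    }

    /** Number of times the underlying resolver was actually invoked. */
    int resolverCalls() {
        return resolverCalls;
    }
}
```

With this sketch, the six uses of `get_json_object` in the query above would reach the resolver once. One concern is invalidation: such a cache would have to be cleared whenever a function is registered or dropped, or a module is loaded or unloaded, so it may only be safe within the scope of a single statement.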
> reduce unnecessary method calls in ModuleManager
> -------------------------------------------------
>
> Key: FLINK-21738
> URL: https://issues.apache.org/jira/browse/FLINK-21738
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Reporter: zoucao
> Priority: Major
>
> In Flink SQL, if we use many functions (hive functions or Flink built-in functions),
> Flink calls the method
> `getFunctionDefinition` in
> [ModuleManager|https://github.com/apache/flink/blob/97bfd049951f8d52a2e0aed14265074c4255ead0/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/module/ModuleManager.java#L44]
> many times to load the functions, and each module's `listFunctions` method is
> called each time as well. I think the same result will be returned for a given
> module, so maybe a cache should be used here to reduce the wasted time.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)