[
https://issues.apache.org/jira/browse/FLINK-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16966495#comment-16966495
]
jackylau commented on FLINK-14243:
----------------------------------
Hi [~jark], this Hive UDF is not thread-safe when it is called from Flink.
We also found a memory leak when using this UDF.
The screenshot below, taken with the Eclipse Memory Analyzer (MAT) tool,
shows it:
!Snipaste_2019-10-30_15-34-09.png!
> flink hiveudf needs some check when it is using cache
> -----------------------------------------------------
>
> Key: FLINK-14243
> URL: https://issues.apache.org/jira/browse/FLINK-14243
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive, Table SQL / Planner
> Affects Versions: 1.9.0
> Reporter: jackylau
> Priority: Major
> Fix For: 1.10.0
>
> Attachments: Snipaste_2019-10-30_15-34-09.png
>
>
> Flink 1.9 introduces the Hive connector, but problems arise when the
> original Hive UDF uses a cache. Hive runs tasks in parallel at the process
> level, each in its own JVM, while Flink/Spark run tasks in parallel as
> threads within one JVM. If Flink simply calls the Hive UDF, a cache inside
> the UDF is no longer thread-safe.
> So Flink may need to check the Hive UDF code and, if it is not thread-safe,
> set the Flink parallelism to 1.
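
To illustrate the sharing problem described above, here is a minimal plain-Java sketch. It does not use the Hive or Flink APIs: `CachingUpperUdf` and `CachedUdfSketch` are hypothetical names, and the static cache is an assumption about how such a UDF might memoize results. Several threads, standing in for Flink task threads in one JVM, each create their own UDF instance but still share the same static cache. With a plain `HashMap` this would be a data race; the sketch uses `ConcurrentHashMap` as one possible way to make the shared access safe.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Hive UDF that memoizes results in a static cache.
final class CachingUpperUdf {
    // Static, so every instance (and every task thread) in the JVM shares it.
    // A plain HashMap here would be unsafe under concurrent access.
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    // Counts how often the underlying computation actually runs.
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    String evaluate(String input) {
        // computeIfAbsent runs the mapping function at most once per key.
        return CACHE.computeIfAbsent(input, key -> {
            COMPUTATIONS.incrementAndGet();
            return key.toUpperCase(Locale.ROOT);
        });
    }

    static int cacheSize() {
        return CACHE.size();
    }
}

final class CachedUdfSketch {
    // Runs the UDF from several threads and returns the number of cached keys.
    static int runConcurrently(int threads, int callsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                CachingUpperUdf udf = new CachingUpperUdf(); // one instance per "task"
                for (int i = 0; i < callsPerThread; i++) {
                    udf.evaluate("key-" + (i % 10)); // only 10 distinct inputs
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return CachingUpperUdf.cacheSize();
    }

    public static void main(String[] args) {
        System.out.println("distinct cached keys: " + runConcurrently(4, 1000));
    }
}
```

Note that the cache in this sketch is unbounded, so in a long-running JVM it only ever grows with the number of distinct inputs. Whether that is related to the memory growth reported in the comment is not confirmed by this issue.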