Volodymyr Kvych created PHOENIX-4958:
----------------------------------------

             Summary: HBase does not load updated UDF class simultaneously on 
whole cluster
                 Key: PHOENIX-4958
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4958
             Project: Phoenix
          Issue Type: Bug
            Reporter: Volodymyr Kvych


To update a UDF within the limitations described at [https://phoenix.apache.org/udf.html], I 
perform the following steps:
 # Drop the existing function and delete its JAR file:
{code:sql}
DROP FUNCTION my_function;
DELETE JAR 'hdfs:/.../udf-v1.jar';{code}

 # Remove the old JAR file from the local file system of every node in the cluster, for example:
{code:bash}
rm ${hbase.local.dir}/jars/udf-v1.jar{code}

 # Upload the updated JAR file and re-create the function:
{code:sql}
ADD JARS '/.../udf-v2.jar';
CREATE FUNCTION my_function(...) ... using jar 'hdfs:/.../udf-v2.jar';
{code}

The problem is that each RegionServer may keep the previously loaded 
function for an indefinite period of time, until GC collects the corresponding 
DynamicClassLoader instance and the old UDF class. As a result, some RegionServers 
may execute the new function's code while others still execute the old one. There 
is no way to ensure that the function has been reloaded by the whole cluster.

As a proposed fix, I've updated UDFExpression to keep DynamicClassLoaders 
per tenant and per JAR name. Since the JAR name must change in order to update 
the UDF correctly, this works for the described use case.
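
To illustrate the idea only, here is a minimal sketch of a class loader cache keyed by tenant and JAR name. It is not the actual patch: the class UdfClassLoaderCache, its Key type and the loaderFactory parameter are hypothetical names invented for this example, and the factory stands in for whatever code really creates a DynamicClassLoader for a given JAR.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: one class loader per (tenant, JAR name) pair.
// Not Phoenix code; all names here are assumptions for illustration.
final class UdfClassLoaderCache {

    // Composite cache key. tenantId may be null (no tenant).
    private static final class Key {
        private final String tenantId;
        private final String jarName;

        Key(String tenantId, String jarName) {
            this.tenantId = tenantId;
            this.jarName = jarName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return Objects.equals(tenantId, other.tenantId)
                    && Objects.equals(jarName, other.jarName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(tenantId, jarName);
        }
    }

    private final Map<Key, ClassLoader> loaders = new ConcurrentHashMap<>();

    // Creates a loader for a JAR name; assumed to be supplied by the caller.
    private final Function<String, ClassLoader> loaderFactory;

    UdfClassLoaderCache(Function<String, ClassLoader> loaderFactory) {
        this.loaderFactory = Objects.requireNonNull(loaderFactory);
    }

    // Returns the cached loader for this tenant and JAR name, creating it on
    // first use. A new JAR name never reuses a loader built for an old JAR.
    ClassLoader get(String tenantId, String jarName) {
        return loaders.computeIfAbsent(
                new Key(tenantId, jarName),
                k -> loaderFactory.apply(jarName));
    }

    // Number of cached loaders.
    int size() {
        return loaders.size();
    }
}
```

With a cache of this shape, a function created from udf-v2.jar would resolve its class through a loader that has never seen udf-v1.jar, so the result no longer depends on when GC collects the old loader. The sketch never evicts entries; a real implementation would also need to decide when loaders for deleted JARs are released.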



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
