- One instance is created when the function is registered with the 
FunctionRegistry.

- During query compilation, the UDF is copied from the version in the 
registry, which creates a second instance.

- The query plan is serialized, then deserialized by the tasks during query 
execution, which constructs another instance of the UDF.
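
The three stages above can be sketched as a small standalone simulation. This is not Hive code: FakeUdf, newInstance and run are hypothetical names, and creating each copy through reflection is an assumption made only to keep the example self-contained. It also assumes initialize() runs on the compile-time copy and on the task-side copy, which matches the counts reported in the question below.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class UdfLifecycleSketch {
    // Counters shared by all instances, so calls can be tallied across copies.
    static final AtomicInteger constructed = new AtomicInteger();
    static final AtomicInteger initialized = new AtomicInteger();

    // Hypothetical stand-in for a class extending GenericUDF.
    public static class FakeUdf {
        public FakeUdf() { constructed.incrementAndGet(); }
        public void initialize() { initialized.incrementAndGet(); }
    }

    // Assumption: each stage obtains its own copy by invoking the no-arg constructor.
    static FakeUdf newInstance(Class<? extends FakeUdf> cls)
            throws ReflectiveOperationException {
        return cls.getDeclaredConstructor().newInstance();
    }

    // Walks through the three stages and returns {constructor calls, initialize calls}.
    public static int[] run() throws ReflectiveOperationException {
        constructed.set(0);
        initialized.set(0);

        // Stage 1: registering the function creates the registry's instance.
        FakeUdf registered = newInstance(FakeUdf.class);

        // Stage 2: query compilation copies the registry's instance and initializes it.
        FakeUdf compiled = newInstance(registered.getClass());
        compiled.initialize();

        // Stage 3: a task deserializes the plan, building and initializing its own copy.
        FakeUdf inTask = newInstance(compiled.getClass());
        inTask.initialize();

        return new int[] { constructed.get(), initialized.get() };
    }

    public static void main(String[] args) throws Exception {
        int[] counts = run();
        System.out.println("constructor calls: " + counts[0]); // 3
        System.out.println("initialize calls: " + counts[1]);  // 2
    }
}
```

Because every stage works on its own instance, state set in the constructor or in initialize() on one copy should not be expected to be visible on another.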



________________________________
From: Anirudh Paramshetti <anirudh2...@gmail.com>
Sent: Tuesday, February 02, 2016 6:29 AM
To: user@hive.apache.org
Subject: GenericUDF

Hi,

I have written a custom UDF in Java extending the GenericUDF class. I have some 
print statements in the constructor and the initialize method, to understand 
the number of calls made to them. From what I have read about GenericUDF, I was 
expecting the constructor and the initialize method to be called once per UDF 
instance. But what I found was that the constructor was called three times 
(once while creating the temporary function and twice while using it in the 
Hive query) and the initialize method was called twice (while using it in the 
Hive query).

UDF output:

hive> create temporary function replace as 
'package.name.GenericNullReplacement';
Inside constructor of GenericNullReplacement

hive> select replace(column_name, 0.01) from dummy_table;
Inside constructor of GenericNullReplacement
Inside constructor of GenericNullReplacement
Inside initialize() method of GenericNullReplacement
Inside initialize() method of GenericNullReplacement
1.23
4.56
4.56
0.01
4.56
9.56

It would be great if someone could explain to me what is happening here.


Thanks and Regards,
Anirudh Paramshetti
