Recently, while creating a custom generic Hive UDF, I came across 
unexpected behavior in the evaluate method. The UDF increments a counter 
and writes it to a file. When I execute it directly, without involving any 
table, it always returns one extra count, i.e. 2 instead of 1.
                When I added some logging inside the evaluate method, I 
observed that the logs (sysout) were printed twice. On further research I 
came across the @UDFType annotation and found that if we do not provide 
this annotation in a custom UDF, the default is deterministic = true.
                When I add this annotation to my custom UDF and set 
@UDFType( deterministic = false ), my logs are printed only once and the 
UDF returns the accurate count, i.e. 1, which implies that evaluate is 
called only once when @UDFType( deterministic = false ).
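
                To illustrate the default I am referring to, here is a minimal 
sketch. It does not use Hive itself: the UDFType annotation below is a local 
stand-in that I am assuming mirrors the deterministic attribute of Hive's 
annotation, and CounterUdf, PlainUdf and isDeterministic are hypothetical 
names made up for this example. It only shows that a class without the 
annotation is treated as deterministic, while an explicit 
deterministic = false overrides that.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class UdfTypeSketch {

    // Local stand-in for Hive's @UDFType (assumption: same attribute name
    // and same default of true). This is NOT the Hive class.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface UDFType {
        boolean deterministic() default true;
    }

    // Hypothetical counter UDF, explicitly marked non-deterministic.
    @UDFType(deterministic = false)
    static class CounterUdf {
        private int count = 0;

        int evaluate() {
            return ++count; // side effect: each call changes the result
        }
    }

    // Hypothetical UDF with no annotation at all.
    static class PlainUdf {
        private int count = 0;

        int evaluate() {
            return ++count;
        }
    }

    // Treat a class as deterministic unless it is annotated otherwise.
    static boolean isDeterministic(Class<?> udfClass) {
        UDFType type = udfClass.getAnnotation(UDFType.class);
        return type == null || type.deterministic();
    }

    public static void main(String[] args) {
        System.out.println(isDeterministic(CounterUdf.class)); // false
        System.out.println(isDeterministic(PlainUdf.class));   // true
    }
}
```

                So a UDF with side effects, like my counter, is treated as 
deterministic unless the annotation says otherwise.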
                I would like to understand the connection between 
@UDFType and the evaluate method when a UDF is invoked directly without a table.

                Note: When I invoke my UDF on a table, I get the correct 
count even with @UDFType( deterministic = true ).

                Thanks in advance. :)
PradeepKumar Yadav
