[ 
https://issues.apache.org/jira/browse/TRAFODION-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345283#comment-16345283
 ] 

ASF GitHub Bot commented on TRAFODION-2912:
-------------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1420


> Non-deterministic scalar UDFs not executed once per row
> -------------------------------------------------------
>
>                 Key: TRAFODION-2912
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2912
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.0-incubating
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>            Priority: Major
>             Fix For: 2.3
>
>
> This problem was found by Andy Yang.
> Andy created a random generator scalar UDF and found that it did not return a 
> different random value for each row:
> {noformat}
> >>select scalar_rand_udf(), scalar_rand_udf()
> +>from (values (1), (2), (3)) T(s);
> RND RND 
> ----------- -----------
> 846930886 1804289383
> 846930886 1804289383
> 846930886 1804289383
> --- 3 row(s) selected.
> >>
> {noformat}
> Here is the explain, it shows that we are using hash joins, not nested joins, 
> to evaluate the UDFs:
> {noformat}
> >>explain options 'f' s;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 5    .    6    root                                                  3.00E+000
> 4    1    5    hybrid_hash_join                                      3.00E+000
> 3    2    4    hybrid_hash_join                                      1.00E+000
> .    .    3    isolated_scalar_udf             SCALAR_RAND_UDF       1.00E+000
> .    .    2    isolated_scalar_udf             SCALAR_RAND_UDF       1.00E+000
> .    .    1    tuplelist                                             3.00E+000
> --- SQL operation complete.
> >>
> {noformat}
> The problem is that we don't check for non-deterministic UDFs when we 
> transform a TSJ to a regular join in the transformer or normalizer. We don't 
> even set the non-deterministic flag in the group attributes of the 
> IsolatedScalarUDF node.
> The fix is to set this flag correctly and to add a check and not transform 
> routine joins for non-deterministic isolated scalar UDFs into a regular join.
> To recreate:
> Here is the source code of the UDF:
> {noformat}
> #include "sqludr.h"
> #include <stdlib.h>
> SQLUDR_LIBFUNC SQLUDR_INT32 scalar_rand_udf(SQLUDR_INT32 *out1,
>                                             SQLUDR_INT16 *outInd1,
>                                             SQLUDR_TRAIL_ARGS)
> {
>   if (calltype == SQLUDR_CALLTYPE_FINAL)
>     return SQLUDR_SUCCESS;
>   (*out1) = rand();
>   return SQLUDR_SUCCESS;
> }
> {noformat}
> Compile the UDF:
> {noformat}
> gcc -g -Wall -I$TRAF_HOME/export/include/sql -shared -fPIC -o 
> scalar_rand_udf.so scalar_rand_udf.c
> {noformat}
> Create the UDF and run it:
> {noformat}
> drop function scalar_rand_udf;
> drop library scalar_rand_udf_lib;
> create library scalar_rand_udf_lib
>  file '/home/zellerh/src/scalar_rand_udf/scalar_rand_udf.so';
> create function scalar_rand_udf() returns (rnd int)
>   external name 'scalar_rand_udf' library scalar_rand_udf_lib
>   not deterministic no sql no transaction required;
> prepare s from
> select scalar_rand_udf(), scalar_rand_udf()
> from (values (1), (2), (3)) T(s);
> explain options 'f' s;
> execute s;
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to