[ 
https://issues.apache.org/jira/browse/TRAFODION-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750363#comment-15750363
 ] 

Hans Zeller commented on TRAFODION-2400:
----------------------------------------

The problem happens in preCodeGen. We decide early on which data types to 
present to the UDR writer, but preCodeGen may later replace an expression, for 
example turning a reference to the char(20) column userid into the char(10) 
constant 'super-user'. Currently the executor then presents the record to the 
UDF in the format determined by preCodeGen (char(10) in this case), while the 
UDF assumes the original record format with a char(20). As a result, the UDF 
reads corrupted data.
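
To make the mismatch concrete, here is a toy model in Python. It is not 
Trafodion code: only the char(20) versus char(10) userid comes from the case 
above, while the 8-byte timestamp field, the little-endian packing, and all 
function names are assumptions made for illustration.

```python
import struct

# Hypothetical row layouts (assumption: userid followed by an 8-byte integer).
EXECUTOR_ROW = struct.Struct("<10sq")  # format after the preCodeGen rewrite
UDF_ROW = struct.Struct("<20sq")       # format the UDF was told to expect

def executor_write(rows):
    """Pack rows back to back using the executor's (narrower) layout."""
    return b"".join(EXECUTOR_ROW.pack(u.encode("ascii"), ts) for u, ts in rows)

def udf_read_first(buf):
    """Read the first row using the UDF's (wider) layout."""
    return UDF_ROW.unpack_from(buf, 0)

buf = executor_write([("super-user", 1000), ("super-user", 2000)])
userid, ts = udf_read_first(buf)

# The 20-byte userid now swallows the timestamp bytes and part of the next
# row, and the timestamp is decoded from bytes that belong to a userid.
print(userid)
print(ts)
```

In this model the UDF still sees 'super-user' at the start of the field, but 
the remaining bytes and the timestamp come from the wrong offsets.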

The fix is to add a cast to the original UDR data type in preCodeGen if needed. 
We don't want to change the data types used in the UDR during preCodeGen, since 
the UDR writer may rely on the types staying constant throughout the 
compilation and execution phases.
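
The effect of such a cast can be sketched with a similar toy model in Python. 
Again, this is an illustration under assumptions, not the actual preCodeGen 
change: the row layout, cast_to_char, and executor_write are hypothetical 
names, and the cast is modeled as blank-padding to the declared width, as SQL 
does for fixed-length character types.

```python
import struct

# Hypothetical row layout the UDF was told to expect:
# char(20) userid followed by an 8-byte integer.
UDF_ROW = struct.Struct("<20sq")

def cast_to_char(value, width):
    """Model CAST(value AS CHAR(width)): blank-pad to the declared width."""
    data = value.encode("ascii")
    if len(data) > width:
        raise ValueError("value does not fit into char(%d)" % width)
    return data.ljust(width, b" ")

def executor_write(rows):
    """Pack rows using the UDF's original layout, casting userid first."""
    return b"".join(UDF_ROW.pack(cast_to_char(u, 20), ts) for u, ts in rows)

buf = executor_write([("super-user", 1000), ("super-user", 2000)])

# Writer and reader now agree on the layout, so every row decodes cleanly.
rows = list(UDF_ROW.iter_unpack(buf))
for userid, ts in rows:
    print(userid, ts)
```

Because the constant is widened back to the type the UDF was compiled against, 
the UDF-side layout never has to change.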

> Incorrect data returned by TMUDF with selection predicate on input table
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-2400
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2400
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>             Fix For: 2.0-incubating
>
>
> We saw incorrect results from a query with the following characteristics:
>   - the incorrect data is read from the input table
>   - there is an equals predicate col=const on the input table
>   - the constant const has a data type that is smaller than the column (e.g. 
> comparing an int to a constant 1000, which is a smallint, or comparing a 
> char(20) column to a char(1) constant 'x').
> To demonstrate the issue, I added the following to regression test 
> udr/TEST001:
> {noformat}
> SELECT cast(CONVERTTIMESTAMP(ts) as TIME(6)), userid, session_id, ipAddr
> FROM UDF(sessionize_dynamic(TABLE(SELECT userid,
>                                          JULIANTIMESTAMP(ts) as TS,
>                                          ipAddr
>                                   FROM clicks
>                                   WHERE userid='super-user'
>                                   PARTITION BY 1 ORDER BY 2),
>                             'USERID',
>                             'TS',
>                             60000000));
> SELECT cast(CONVERTTIMESTAMP(ts) as TIME(6)), userid, session_id, ipAddr
> FROM UDF(sessionize_dynamic(TABLE(SELECT userid,
>                                          JULIANTIMESTAMP(ts) as TS,
>                                          ipAddr
>                                   FROM clicks
>                                   WHERE userid='super-user'
>                                   PARTITION BY 1 ORDER BY 2),
>                             'USERID',
>                             'TS',
>                             60000000));
> {noformat}
> For some reason I had to run the same select twice; the first run didn't show 
> a corrupted userid and/or ipAddr field.



