[ https://issues.apache.org/jira/browse/HIVE-19889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514258#comment-16514258 ]
Hive QA commented on HIVE-19889:
--------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 0s{color} | {color:blue} ql in master has 2276 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-11807/dev-support/hive-personality.sh |
| git revision | master / 5a9a328 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-11807/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Wrong results due to PPD of non deterministic functions with CBO
> ----------------------------------------------------------------
>
>                 Key: HIVE-19889
>                 URL: https://issues.apache.org/jira/browse/HIVE-19889
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-19889.1.patch
>
>
> The following query can give wrong results when CBO is on:
> {code}
> select * from (
> select part1,randum123
> from (SELECT *, cast(rand() as double) AS randum123 FROM testA where part1='CA' and part2 = 'ABC') a
> where randum123 <= 0.5) s where s.randum123 > 0.25 limit 20;
> {code}
> The plan of the query is as follows:
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: testa
>             Statistics: Num rows: 2 Data size: 4580 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: ((rand() <= 0.5D) and (rand() > 0.25D)) (type: boolean)
>               Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: 'CA' (type: string), rand() (type: double)
>                 outputColumnNames: _col0, _col1
>                 Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>                 Limit
>                   Number of rows: 20
>                   Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>                     table:
>                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: 20
>       Processor Tree:
>         ListSink
> {code}
> The relevant part in the plan is the filter:
> {code}
>             Filter Operator
>               predicate: ((rand() <= 0.5D) and (rand() > 0.25D)) (type: boolean)
> {code}
> The predicates randum123 <= 0.5 and s.randum123 > 0.25 were pushed down,
> and randum123 was resolved to rand(). This is bad because rand() is then
> invoked twice, and the rand() UDF is non-deterministic. Each call can
> generate a value that satisfies its own predicate without any single value
> satisfying both, whereas the intention of the query is to return rows only
> when rand() falls between 0.25 and 0.5.
> A sample result:
> {code}
> CA 0.9191984370369802
> CA 0.397933021566812
> {code}
> Here the first row (0.9191984370369802) does not satisfy the condition.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
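
The effect described in the quoted report can be reproduced outside Hive with a small simulation. The sketch below is not Hive code and does not use any Hive API; the function names (intended_plan, pushed_down_plan, in_range) and the seed are made up for illustration. It assumes, following the EXPLAIN output above, that every reference to randum123 becomes a separate rand() call after pushdown: two calls in the Filter Operator and a third in the Select Operator.

```python
import random


def intended_plan(rng, n):
    """Evaluate rand() once per row and apply both predicates to that value."""
    rows = []
    for _ in range(n):
        r = rng.random()  # randum123 materialized a single time
        if r <= 0.5 and r > 0.25:
            rows.append(("CA", r))
    return rows


def pushed_down_plan(rng, n):
    """Mimic the reported plan: each use of randum123 is a fresh rand() call."""
    rows = []
    for _ in range(n):
        # Filter Operator: two independent draws, one per predicate.
        if rng.random() <= 0.5 and rng.random() > 0.25:
            # Select Operator: a third independent draw is what gets output.
            rows.append(("CA", rng.random()))
    return rows


def in_range(rows):
    """True if every emitted value lies in the interval (0.25, 0.5]."""
    return all(0.25 < r <= 0.5 for _, r in rows)


rng = random.Random(19889)  # fixed seed (arbitrary) so runs are repeatable
good = intended_plan(rng, 10_000)
bad = pushed_down_plan(rng, 10_000)

print("intended plan, all values in range:", in_range(good))
print("pushed-down plan, all values in range:", in_range(bad))
```

Under these assumptions the first check always holds, while the second should fail for any realistic row count, because the value that is output is unrelated to the values that were tested. That matches the sample result above, where 0.9191984370369802 got through the filter.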