Kim Jin Chul has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8355 )
Change subject: IMPALA-5754: Improve randomness of rand()/random()
......................................................................

IMPALA-5754: Improve randomness of rand()/random()

Currently, the implementation of the rand()/random() built-in functions
uses rand_r from the C library, and we found its randomness to be poor.
std::mt19937 from the C++11 library shows better randomness than rand_r
because it has a much longer period than rand in C.
(More details at http://www.pcg-random.org/)

Here is the comparison between before and after:

* Before
> select count(distinct(rand(1))), count(*) from t1
+---------------------------+-----------+
| count(distinct (rand(1))) | count(*)  |
+---------------------------+-----------+
| 17053                     | 103809024 |
+---------------------------+-----------+

* After
> select count(distinct(rand(1))), count(*) from t1
+---------------------------+-----------+
| count(distinct (rand(1))) | count(*)  |
+---------------------------+-----------+
| 34603008                  | 103809024 |
+---------------------------+-----------+

You might expect maximum randomness (e.g. 103809024 distinct values).
Due to IMPALA-6117, the randomness can be "maximum randomness / n",
where "n" is the number of Impala execution engines. n is 3 in this
example (103809024 / 3 = 34603008), and each execution engine loads and
processes data in parallel.

This change introduces new utility code for random number generation
because we plan to replace the legacy code in IMPALA-4954 with this
utility code.

Testing:
rand-util-test is newly added. It checks randomness, determinism and
range.
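For readers who have not opened the diff, the idea of wrapping std::mt19937 behind a small seeded utility can be sketched roughly as follows. This is only an illustration under assumptions: the class name SketchRandom, its methods, and the CountDistinct helper are hypothetical and are not the API added in be/src/util/rand-util.h. The three properties it exposes mirror what the commit message says rand-util-test checks (randomness, determinism, range).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_set>

// Hypothetical wrapper for illustration only; NOT the rand-util.h API.
class SketchRandom {
 public:
  // The same seed always yields the same sequence (determinism).
  explicit SketchRandom(uint32_t seed) : gen_(seed) {}

  // Returns the next raw 32-bit value from the Mersenne Twister.
  uint32_t Next() { return static_cast<uint32_t>(gen_()); }

  // Returns a double in [0, 1): a 32-bit value divided by 2^32.
  double NextDouble() {
    double value = Next() / 4294967296.0;
    assert(value >= 0.0 && value < 1.0);  // range check
    return value;
  }

 private:
  std::mt19937 gen_;
};

// Counts how many distinct raw values appear among `n` draws for `seed`,
// a single-process analogue of count(distinct(rand(seed))).
std::size_t CountDistinct(uint32_t seed, std::size_t n) {
  SketchRandom rng(seed);
  std::unordered_set<uint32_t> seen;
  for (std::size_t i = 0; i < n; ++i) seen.insert(rng.Next());
  return seen.size();
}
```

Calling CountDistinct(1, n) for a chosen n gives a rough single-generator analogue of the query above; it does not model the per-engine behaviour described in IMPALA-6117.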

Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/util/CMakeLists.txt
A be/src/util/rand-util-test.cc
A be/src/util/rand-util.cc
A be/src/util/rand-util.h
6 files changed, 187 insertions(+), 29 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8355/3
--
To view, visit http://gerrit.cloudera.org:8080/8355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684
Gerrit-Change-Number: 8355
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul <jinc...@gmail.com>