spark git commit: [SPARK-17854][SQL] rand/randn allows null/long as input seed
Repository: spark
Updated Branches:
  refs/heads/master 23ce0d1e9 -> 340f09d10

[SPARK-17854][SQL] rand/randn allows null/long as input seed

## What changes were proposed in this pull request?

This PR proposes that `rand`/`randn` accept `null` as the input seed in Scala/SQL and `LongType` as the input seed in SQL. A `null` seed is treated as `0`.

So, this PR includes both changes below:

- `null` support

  It seems MySQL also accepts this.

  ```sql
  mysql> select rand(0);
  +---------------------+
  | rand(0)             |
  +---------------------+
  | 0.15522042769493574 |
  +---------------------+
  1 row in set (0.00 sec)

  mysql> select rand(NULL);
  +---------------------+
  | rand(NULL)          |
  +---------------------+
  | 0.15522042769493574 |
  +---------------------+
  1 row in set (0.00 sec)
  ```

  Hive does as well, according to [HIVE-14694](https://issues.apache.org/jira/browse/HIVE-14694).

  So the code below:

  ```scala
  spark.range(1).selectExpr("rand(null)").show()
  ```

  prints:

  **Before**

  ```
  Input argument to rand must be an integer literal.;; line 1 pos 0
  org.apache.spark.sql.AnalysisException: Input argument to rand must be an integer literal.;; line 1 pos 0
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:465)
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:444)
  ```

  **After**

  ```
  +-----------------------+
  |rand(CAST(NULL AS INT))|
  +-----------------------+
  |    0.13385709732307427|
  +-----------------------+
  ```

- `LongType` support in SQL

  In addition, this makes the functions accept `LongType` consistently across Scala and SQL.

  In more detail, the code below:

  ```scala
  spark.range(1).select(rand(1), rand(1L)).show()
  spark.range(1).selectExpr("rand(1)", "rand(1L)").show()
  ```

  prints:

  **Before**

  ```
  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+

  Input argument to rand must be an integer literal.;; line 1 pos 0
  org.apache.spark.sql.AnalysisException: Input argument to rand must be an integer literal.;; line 1 pos 0
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:465)
    at
  ```

  **After**

  ```
  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+

  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+
  ```

## How was this patch tested?

Unit tests in `DataFrameSuite.scala` and `RandomSuite.scala`.

Author: hyukjinkwon

Closes #15432 from HyukjinKwon/SPARK-17854.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/340f09d1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/340f09d1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/340f09d1

Branch: refs/heads/master
Commit: 340f09d100cb669bc6795f085aac6fa05630a076
Parents: 23ce0d1
Author: hyukjinkwon
Authored: Sun Nov 6 14:11:37 2016 +
Committer: Sean Owen
Committed: Sun Nov 6 14:11:37 2016 +

----------------------------------------------------------------------
 .../expressions/randomExpressions.scala         | 50 +++-
 .../sql/catalyst/expressions/RandomSuite.scala  |  6 ++
 .../test/resources/sql-tests/inputs/random.sql  | 17
 .../resources/sql-tests/results/random.sql.out  | 84
 4 files changed, 135 insertions(+), 22 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/340f09d1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
index a331a55..1d7a3c7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
@@ -17,11 +17,10 @@
 package org.apache.spark.sql.catalyst.expressions
-import org.apache.spark.TaskContext
 import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext,
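
The seed handling described in the commit message above (integer and long seeds both accepted, a `null` seed treated as `0`) can be illustrated with a small standalone sketch. This is not the code from the patch, whose diff is truncated here: `SeedArg`, `IntSeed`, `LongSeed`, `NullSeed`, and `SeedSketch` are hypothetical names invented for the illustration, and `java.util.Random` is only a stand-in generator, so the values it produces will not match Spark's output.

```scala
// Hypothetical stand-ins for the kinds of seed argument the message describes.
// None of these are Spark classes.
sealed trait SeedArg
case class IntSeed(value: Int) extends SeedArg
case class LongSeed(value: Long) extends SeedArg
case object NullSeed extends SeedArg

object SeedSketch {
  // Normalize every accepted seed argument to a Long.
  // A null seed maps to 0, matching the behavior stated in the commit message.
  def resolveSeed(arg: SeedArg): Long = arg match {
    case IntSeed(v)  => v.toLong
    case LongSeed(v) => v
    case NullSeed    => 0L
  }

  // Draw one value from a generator seeded with the normalized seed.
  // Assumption: java.util.Random stands in for whatever generator Spark uses.
  def rand(arg: SeedArg): Double =
    new java.util.Random(resolveSeed(arg)).nextDouble()
}
```

Under this sketch, `SeedSketch.rand(IntSeed(1))` and `SeedSketch.rand(LongSeed(1L))` return the same value, and `SeedSketch.rand(NullSeed)` equals `SeedSketch.rand(IntSeed(0))`, which mirrors the Before/After behavior reported above.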
spark git commit: [SPARK-17854][SQL] rand/randn allows null/long as input seed
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 c42301f1e -> dcbf3fd4b

[SPARK-17854][SQL] rand/randn allows null/long as input seed

## What changes were proposed in this pull request?

This PR proposes that `rand`/`randn` accept `null` as the input seed in Scala/SQL and `LongType` as the input seed in SQL. A `null` seed is treated as `0`.

So, this PR includes both changes below:

- `null` support

  It seems MySQL also accepts this.

  ```sql
  mysql> select rand(0);
  +---------------------+
  | rand(0)             |
  +---------------------+
  | 0.15522042769493574 |
  +---------------------+
  1 row in set (0.00 sec)

  mysql> select rand(NULL);
  +---------------------+
  | rand(NULL)          |
  +---------------------+
  | 0.15522042769493574 |
  +---------------------+
  1 row in set (0.00 sec)
  ```

  Hive does as well, according to [HIVE-14694](https://issues.apache.org/jira/browse/HIVE-14694).

  So the code below:

  ```scala
  spark.range(1).selectExpr("rand(null)").show()
  ```

  prints:

  **Before**

  ```
  Input argument to rand must be an integer literal.;; line 1 pos 0
  org.apache.spark.sql.AnalysisException: Input argument to rand must be an integer literal.;; line 1 pos 0
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:465)
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:444)
  ```

  **After**

  ```
  +-----------------------+
  |rand(CAST(NULL AS INT))|
  +-----------------------+
  |    0.13385709732307427|
  +-----------------------+
  ```

- `LongType` support in SQL

  In addition, this makes the functions accept `LongType` consistently across Scala and SQL.

  In more detail, the code below:

  ```scala
  spark.range(1).select(rand(1), rand(1L)).show()
  spark.range(1).selectExpr("rand(1)", "rand(1L)").show()
  ```

  prints:

  **Before**

  ```
  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+

  Input argument to rand must be an integer literal.;; line 1 pos 0
  org.apache.spark.sql.AnalysisException: Input argument to rand must be an integer literal.;; line 1 pos 0
    at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$5.apply(FunctionRegistry.scala:465)
    at
  ```

  **After**

  ```
  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+

  +------------------+------------------+
  |           rand(1)|           rand(1)|
  +------------------+------------------+
  |0.2630967864682161|0.2630967864682161|
  +------------------+------------------+
  ```

## How was this patch tested?

Unit tests in `DataFrameSuite.scala` and `RandomSuite.scala`.

Author: hyukjinkwon

Closes #15432 from HyukjinKwon/SPARK-17854.

(cherry picked from commit 340f09d100cb669bc6795f085aac6fa05630a076)
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dcbf3fd4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dcbf3fd4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dcbf3fd4

Branch: refs/heads/branch-2.1
Commit: dcbf3fd4bd42059aed9c966d4f0cdf58815eb802
Parents: c42301f
Author: hyukjinkwon
Authored: Sun Nov 6 14:11:37 2016 +
Committer: Sean Owen
Committed: Sun Nov 6 14:11:47 2016 +

----------------------------------------------------------------------
 .../expressions/randomExpressions.scala         | 50 +++-
 .../sql/catalyst/expressions/RandomSuite.scala  |  6 ++
 .../test/resources/sql-tests/inputs/random.sql  | 17
 .../resources/sql-tests/results/random.sql.out  | 84
 4 files changed, 135 insertions(+), 22 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/dcbf3fd4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
index a331a55..1d7a3c7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala
@@ -17,11 +17,10 @@
 package org.apache.spark.sql.catalyst.expressions
-import org.apache.spark.TaskContext
 import org.apache.spark.sql.AnalysisException
 import