[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240073151 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter } } } + + test("SPARK-25717: Insert overwrite a recreated external and partitioned table " --- End diff -- How about `SPARK-25717: Insert overwrites remove old partition dirs correctly`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240002357 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -227,18 +227,22 @@ case class InsertIntoHiveTable( // Newer Hive largely improves insert overwrite performance. As Spark uses older Hive // version and we may not want to catch up new Hive version every time. We delete the // Hive partition first and then load data file into the Hive partition. - if (oldPart.nonEmpty && overwrite) { -oldPart.get.storage.locationUri.foreach { uri => - val partitionPath = new Path(uri) - val fs = partitionPath.getFileSystem(hadoopConf) - if (fs.exists(partitionPath)) { -if (!fs.delete(partitionPath, true)) { - throw new RuntimeException( -"Cannot remove partition directory '" + partitionPath.toString) -} -// Don't let Hive do overwrite operation since it is slower. -doHiveOverwrite = false + if (overwrite) { +val oldPartitionPath = oldPart.flatMap(_.storage.locationUri.map(new Path(_))) + .getOrElse { +ExternalCatalogUtils.generatePartitionPath( + partitionSpec, + partitionColumnNames, + new Path(table.location)) --- End diff -- Do we still need to consider the old path, `oldPart`? Can't we write this instead?; ``` val oldPartitionPath = ExternalCatalogUtils.generatePartitionPath( partitionSpec, partitionColumnNames, new Path(table.location)) ``` Also, can you write a comment about how this issue is solved, both here and in the PR description?
[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r239997805 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter } } } + + test("SPARK-25717: Insert overwrite a recreated external and partitioned table " ++ "should remove the historical partition first") { +withTempDir { tmpDir => + withTable("test_table") { +(0 until 3).foreach { _ => + sql("DROP TABLE IF EXISTS test_table") + sql( +s""" + |CREATE EXTERNAL TABLE test_table (key int) + |PARTITIONED BY (p int) + |LOCATION '${tmpDir.toURI.toString.stripSuffix("/")}/test_table' + """.stripMargin) + sql("INSERT OVERWRITE TABLE test_table PARTITION(p=1) SELECT 1") +} +checkAnswer(sql("SELECT COUNT(*) FROM test_table"), Row(1)) + } +} --- End diff -- How about writing tests like this?; ``` withTempDir { tmpDir => withTable("test_table") { // Prepare table data sql( s""" |CREATE EXTERNAL TABLE test_table(key int) |PARTITIONED BY (p int) |LOCATION '${tmpDir.toURI}' """.stripMargin) sql("INSERT INTO test_table PARTITION(p=1) SELECT 1") checkAnswer(sql("SELECT COUNT(*) FROM test_table"), Row(1)) // Run the test... sql("DROP TABLE test_table") sql( s""" |CREATE EXTERNAL TABLE test_table(key int) |PARTITIONED BY (p int) |LOCATION '${tmpDir.toURI}' """.stripMargin) sql("INSERT OVERWRITE TABLE test_table PARTITION(p=1) SELECT 1") checkAnswer(sql("SELECT COUNT(*) FROM test_table"), Row(1)) } } ```
[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r239996822 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter } } } + + test("SPARK-25717: Insert overwrite a recreated external and partitioned table " ++ "should remove the historical partition first") { +withTempDir { tmpDir => + withTable("test_table") { +(0 until 3).foreach { _ => + sql("DROP TABLE IF EXISTS test_table") + sql( +s""" + |CREATE EXTERNAL TABLE test_table (key int) + |PARTITIONED BY (p int) + |LOCATION '${tmpDir.toURI.toString.stripSuffix("/")}/test_table' --- End diff -- nit: `|LOCATION '${tmpDir.toURI}'`?
[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r239996528 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter } } } + + test("SPARK-25717: Insert overwrite a recreated external and partitioned table " --- End diff -- Can you shorten the title so it fits on a single line?
[GitHub] spark issue #22707: [SPARK-25717][SQL] Insert overwrite a recreated external...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22707 ok to test
[GitHub] spark pull request #23259: [SPARK-26215][SQL][WIP] Define reserved/non-reser...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23259#discussion_r239994423 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -769,7 +774,7 @@ nonReserved | REVOKE | GRANT | LOCK | UNLOCK | MSCK | REPAIR | RECOVER | EXPORT | IMPORT | LOAD | VALUES | COMMENT | ROLE | ROLES | COMPACTIONS | PRINCIPALS | TRANSACTIONS | INDEX | INDEXES | LOCKS | OPTION | LOCAL | INPATH | ASC | DESC | LIMIT | RENAME | SETS -| AT | NULLS | OVERWRITE | ALL | ANY | ALTER | AS | BETWEEN | BY | CREATE | DELETE +| AT | NULLS | OVERWRITE | ANY | ALTER | AS | BETWEEN | BY | CREATE | DELETE --- End diff -- yea, thanks. you're right.
[GitHub] spark issue #23259: [SPARK-26215][SQL][WIP] Define reserved/non-reserved key...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23259 To discuss this topic smoothly, I made this pr. Any comments/suggestions are welcome. cc: @gatorsmile @cloud-fan @viirya
[GitHub] spark pull request #23259: [SPARK-26215][SQL][WIP] Define reserved/non-reser...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/23259 [SPARK-26215][SQL][WIP] Define reserved/non-reserved keywords based on the ANSI SQL standard ## What changes were proposed in this pull request? This pr aims to define reserved/non-reserved keywords based on the ANSI SQL standard. TODO: - Where should we handle reserved keywords? - Which SQL standard does Spark SQL follow (e.g., 2011 or 2016)? - Where should we document the list of reserved/non-reserved keywords? - Others? ## How was this patch tested? Added tests in `TableIdentifierParserSuite`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark SPARK-26215-WIP Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23259.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23259 commit 01bc38347496a6194b46ace0feb7d2cd1adb614e Author: Takeshi Yamamuro Date: 2018-12-06T08:04:49Z WIP: SQL Reserved/Non-Reserved Key Words
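The practical difference between reserved and non-reserved keywords is whether the parser accepts them as plain identifiers. A minimal sketch of the kind of check `TableIdentifierParserSuite` performs (the specific identifiers tested here are illustrative, not necessarily the ones in the PR):

```scala
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// A non-reserved keyword such as OVERWRITE can still be used as a table name:
val id = CatalystSqlParser.parseTableIdentifier("overwrite")
assert(id.table == "overwrite")

// A keyword made reserved under the ANSI rules this PR proposes (e.g. SELECT)
// would instead be rejected with a ParseException:
// CatalystSqlParser.parseTableIdentifier("select")  // throws ParseException
```

This is why the `nonReserved` rule in `SqlBase.g4` matters: every keyword listed there is routed back into the identifier grammar.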
[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239993889 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends SparkFunSuite with SQLMetricsTestUtils with Shared } test("Sort metrics") { -// Assume the execution plan is -// WholeStageCodegen(nodeId = 0, Range(nodeId = 2) -> Sort(nodeId = 1)) -val ds = spark.range(10).sort('id) -testSparkPlanMetrics(ds.toDF(), 2, Map.empty) +// Assume the execution plan with node id is +// Sort(nodeId = 0) +// Exchange(nodeId = 1) +// Range(nodeId = 2) +val df = spark.range(9, -1, -1).sort('id).toDF() --- End diff -- Do we need to use `range` here? How about just writing `Seq(1, 3, 2, ...).toDF("id")`?
[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239993718 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends SparkFunSuite with SQLMetricsTestUtils with Shared } test("Sort metrics") { -// Assume the execution plan is -// WholeStageCodegen(nodeId = 0, Range(nodeId = 2) -> Sort(nodeId = 1)) -val ds = spark.range(10).sort('id) -testSparkPlanMetrics(ds.toDF(), 2, Map.empty) +// Assume the execution plan with node id is +// Sort(nodeId = 0) +// Exchange(nodeId = 1) +// Range(nodeId = 2) +val df = spark.range(9, -1, -1).sort('id).toDF() +testSparkPlanMetrics(df, 2, Map.empty) + df.queryExecution.executedPlan.find(_.isInstanceOf[SortExec]).getOrElse(assert(false)) --- End diff -- `assert(df.queryExecution.executedPlan.find(_.isInstanceOf[SortExec]).isDefined)`?
[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239993561 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -26,7 +26,7 @@ import org.apache.spark.SparkFunSuite import org.apache.spark.sql._ import org.apache.spark.sql.catalyst.expressions.aggregate.{Final, Partial} import org.apache.spark.sql.catalyst.plans.logical.LocalRelation -import org.apache.spark.sql.execution.{FilterExec, RangeExec, SparkPlan, WholeStageCodegenExec} +import org.apache.spark.sql.execution._ --- End diff -- nit: unfold this wildcard import?
[GitHub] spark issue #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metrics whi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23258 cc: @mgaido91
[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22141 nvm ;) we still have much time until the next release.
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 retest this please
[GitHub] spark issue #23206: [SPARK-26249][SQL] Add ability to inject a rule in order...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23206 What's a concrete example? IMHO the currently proposed API is somewhat complicated/cumbersome for users, and I feel it's error-prone.
[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22141 Any update?
[GitHub] spark issue #21777: [WIP][SPARK-24498][SQL] Add JDK compiler for runtime cod...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21777 @kiszk Can you close this?
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 Anyway, if we can accept the additional test time, I think it is best to run the tests on all 4 patterns above for strict checks.
[GitHub] spark issue #23206: [SPARK-26249][SQL] Add ability to inject a rule in order...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23206 cc: @gatorsmile
[GitHub] spark issue #23206: [SPARK-26249][SQL] Add ability to inject a rule in order...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23206 Is the current post hook not enough for the use case you assume?
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 In short, we need to run the tests on the three patterns below, right? - wholeStage=true, factoryMode=CODEGEN_ONLY (default behaviour in Spark) - wholeStage=false, factoryMode=CODEGEN_ONLY - wholeStage=false, factoryMode=NO_CODEGEN
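The three patterns above map onto two SQL configs. A minimal sketch of how they could be expressed in the Array-of-config-pairs shape that `SQLQueryTestSuite` already feeds into each test run (the variable name and exact shape here are illustrative, not necessarily the PR's code):

```scala
import org.apache.spark.sql.internal.SQLConf

// Each element is one configuration combination to run the whole query
// file suite under: (whole-stage codegen on/off, expression codegen mode).
val codegenConfigSets = Array(
  ("true", "CODEGEN_ONLY"),   // default behaviour in Spark
  ("false", "CODEGEN_ONLY"),  // fall back to per-expression codegen only
  ("false", "NO_CODEGEN")     // fully interpreted mode
).map { case (wholeStageEnabled, factoryMode) =>
  Array(
    SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> wholeStageEnabled,
    SQLConf.CODEGEN_FACTORY_MODE.key -> factoryMode)
}
```

Note that `wholeStage=true` with `factoryMode=NO_CODEGEN` is the fourth combination discussed below; whether it adds coverage beyond these three is the open question in this thread.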
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 yea, I think they're not totally the same..., but I'm not sure that the test run (`wholeStage=false, factoryMode=CODEGEN_ONLY`) is worth the time cost.
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 Sorry, my bad; it was longer than the current master by ~2 times. That's because the current master already runs two config set patterns (`wholeStage=true,factoryMode=CODEGEN_ONLY` and `wholeStage=true,factoryMode=NO_CODEGEN`) in `SQLQueryTestSuite`. The second test run (`wholeStage=true,factoryMode=NO_CODEGEN`) was introduced in my previous pr (#22512). IMHO the two config set patterns below could cover most code paths in Spark: - wholeStage=true, factoryMode=CODEGEN_ONLY - wholeStage=false, factoryMode=NO_CODEGEN In this case, there is little change in the test time; ``` // the current master === Codegen/Interpreter Time Metrics === Total time: 358.584989321 seconds Configs Run Time spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=NO_CODEGEN 165961038511 spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY 192623950810 // with this pr === Codegen/Interpreter Time Metrics === Total time: 345.468455247 seconds Configs Run Time spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY 196572976377 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN 148895478870 ``` WDYT?
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 yea, it seems it's longer by ~4 times; ``` 23:25:43.880 WARN org.apache.spark.sql.SQLQueryTestSuite: === Codegen/Interpreter Time Metrics === Total time: 602.64531157 seconds Configs Run Time spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=NO_CODEGEN 156414789416 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=CODEGEN_ONLY 138343055840 spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY 171905020550 spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN 135982445764 ``` https://github.com/apache/spark/commit/7a69e0b6700fc5c7ad3acef35137f220b8804fd6
[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 I'm looking into that now ;) Just give me more time to check.
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixe...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238905795 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala --- @@ -53,6 +55,133 @@ class ExplainSuite extends QueryTest with SharedSQLContext { checkKeywordsExistsInExplain(df, keywords = "InMemoryRelation", "StorageLevel(disk, memory, deserialized, 1 replicas)") } + + test("optimized plan should show the rewritten aggregate expression") { +withTempView("test_agg") { + sql( +""" + |CREATE TEMPORARY VIEW test_agg AS SELECT * FROM VALUES + | (1, true), (1, false), + | (2, true), + | (3, false), (3, null), + | (4, null), (4, null), + | (5, null), (5, true), (5, false) AS test_agg(k, v) +""".stripMargin) + + // simple explain of queries having every/some/any aggregates. Optimized + // plan should show the rewritten aggregate expression. + val df = sql("SELECT k, every(v), some(v), any(v) FROM test_agg GROUP BY k") + checkKeywordsExistsInExplain(df, +"Aggregate [k#x], [k#x, min(v#x) AS every(v)#x, max(v#x) AS some(v)#x, " + --- End diff -- I forgot to set `extended = true` in explain...
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixe...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238899777 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2899,6 +2899,144 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + private def checkKeywordsExistsInExplain(df: DataFrame, keywords: String*): Unit = { +val output = new java.io.ByteArrayOutputStream() +Console.withOut(output) { + df.explain(extended = true) +} +val normalizedOutput = output.toString.replaceAll("#\\d+", "#x") +for (key <- keywords) { + assert(normalizedOutput.contains(key)) +} + } + + test("optimized plan should show the rewritten aggregate expression") { --- End diff -- updated! Thanks, guys! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238895286 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -144,9 +144,10 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { val (comments, code) = input.split("\n").partition(_.startsWith("--")) // Runs all the tests on both codegen-only and interpreter modes -val codegenConfigSets = Array(CODEGEN_ONLY, NO_CODEGEN).map { - case codegenFactoryMode => -Array(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenFactoryMode.toString) +val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map { --- End diff -- I will check the time later, too.
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238894837 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4) +val mapExpectedRow = mapRow.toSeq(fields4) +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, --- End diff -- Aha, thanks. I remember that it's related to SPARK-18134.
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23194 thanks, merging to master!
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238630406 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2899,6 +2899,144 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + private def checkKeywordsExistsInExplain(df: DataFrame, keywords: String*): Unit = { +val output = new java.io.ByteArrayOutputStream() +Console.withOut(output) { + df.explain(extended = true) +} +val normalizedOutput = output.toString.replaceAll("#\\d+", "#x") +for (key <- keywords) { + assert(normalizedOutput.contains(key)) +} + } + + test("optimized plan should show the rewritten aggregate expression") { --- End diff -- ok --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 yea, it's similar, but I personally think it's orthogonal to SPARK-24562. This pr only targets a default config set for codegen-only and interpreter mode tests.
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238625915 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -144,9 +144,10 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { val (comments, code) = input.split("\n").partition(_.startsWith("--")) // Runs all the tests on both codegen-only and interpreter modes -val codegenConfigSets = Array(CODEGEN_ONLY, NO_CODEGEN).map { - case codegenFactoryMode => -Array(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenFactoryMode.toString) +val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map { --- End diff -- ok
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238625477 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2899,6 +2899,144 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } } + + private def checkKeywordsExistsInExplain(df: DataFrame, keywords: String*): Unit = { +val output = new java.io.ByteArrayOutputStream() +Console.withOut(output) { + df.explain(extended = true) +} +val normalizedOutput = output.toString.replaceAll("#\\d+", "#x") +for (key <- keywords) { + assert(normalizedOutput.contains(key)) +} + } + + test("optimized plan should show the rewritten aggregate expression") { --- End diff -- all the tests? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 cc: @gatorsmile
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23213 cc: @cloud-fan @mgaido91
[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22468 retest this please
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238543369 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4) +val mapExpectedRow = mapRow.toSeq(fields4) +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, --- End diff -- fixed code to use `ExpressionEvalHelper.checkResult`. I don't remember correctly though, we might have some historical reasons about that; `ArrayBasedMapData` has no `hashCode` and `equals`. Probably, somebody might know this... cc: @hvanhovell @viirya --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
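The equality problem discussed above can be reproduced without Spark: a class that, like `ArrayBasedMapData`, keeps its entries in arrays and does not override `equals`/`hashCode` falls back to reference equality. A minimal sketch (the `ArrayBackedMap` class and `toComparable` helper below are hypothetical stand-ins, not Spark's actual classes):

```scala
// A stand-in for ArrayBasedMapData: entries held in arrays, and no
// equals/hashCode override, so `==` falls back to reference equality.
class ArrayBackedMap(val keys: Array[Any], val values: Array[Any])

// Comparing via the backing sequences restores structural equality,
// which is essentially what the test's toComparable workaround does.
def toComparable(m: ArrayBackedMap): (Seq[Any], Seq[Any]) =
  (m.keys.toSeq, m.values.toSeq)

val m1 = new ArrayBackedMap(Array[Any]("k1", "k2"), Array[Any](1, 2))
val m2 = new ArrayBackedMap(Array[Any]("k1", "k2"), Array[Any](1, 2))

val directlyEqual     = m1 == m2                              // false: different references
val structurallyEqual = toComparable(m1) == toComparable(m2)  // true
```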
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238530264 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,100 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, +// we need to take care of it to compare rows. +def toComparable(d: Any): Any = d match { --- End diff -- Since we cannot compare `ArrayBasedMapData`s directly (that is, `assert(mapResultRow === mapExpectedRow)` fails), I just converted them into the `Seq`s of keys/values by this method. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/23213 [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE_CODEGEN_ENABLED=false ## What changes were proposed in this pull request? For better test coverage, this pr sets `WHOLESTAGE_CODEGEN_ENABLED` to `false` so that `SQLQueryTestSuite` also runs its tests in interpreter mode. This pr also moves the existing tests into `SQLQuerySuite` because explain output differs between codegen-only and interpreter modes. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark InterpreterModeTest Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23213.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23213 commit 2ced0cab0c16ee7a2400035a5a7794033eae3ed9 Author: Takeshi Yamamuro Date: 2018-12-04T04:17:48Z Fix
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23194 LGTM except for minor comments
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23194#discussion_r238526892 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -377,41 +377,41 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } - test("CTAS a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) + protected def withEmptyDirInTablePath(dirName: String)(f: File => Unit): Unit = { --- End diff -- private?
[GitHub] spark pull request #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty con...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/23212 [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set when regenerating the golden files ## What changes were proposed in this pull request? This pr returns an empty config set when regenerating the golden files in `SQLQueryTestSuite`. This is a follow-up of #22512. ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark SPARK-25498-FOLLOWUP Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23212.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23212 commit ca09bb82d49d29a5d6f088d2250e980bab64a8b7 Author: Takeshi Yamamuro Date: 2018-12-04T00:39:46Z Fix
[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23212 cc: @cloud-fan
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 retest this please
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238490121 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala --- @@ -157,4 +157,22 @@ object InternalRow { getValueNullSafe } } + + /** + * Returns a writer for an `InternalRow` with given data type. + */ + def getWriter(ordinal: Int, dt: DataType): (InternalRow, Any) => Unit = dt match { --- End diff -- ok
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238489997 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -148,12 +156,25 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { }) // When we are regenerating the golden files we don't need to run all the configs as they // all need to return the same result - if (regenerateGoldenFiles && configs.nonEmpty) { -configs.take(1) + if (regenerateGoldenFiles) { +if (configs.nonEmpty) { + configs.take(1) +} else { + Array.empty[Array[(String, String)]] --- End diff -- ok
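The branch being reviewed in this diff can be sketched as a small self-contained function (the `configsToRun` name and `Configs` alias below are hypothetical, introduced only to make the control flow standalone):

```scala
type Configs = Array[Array[(String, String)]]

// When regenerating golden files, run at most one config set; if no
// configs were specified, explicitly run with none at all rather than
// relying on Array.empty.take(1) returning an empty array.
def configsToRun(regenerateGoldenFiles: Boolean, configs: Configs): Configs =
  if (regenerateGoldenFiles) {
    if (configs.nonEmpty) configs.take(1)
    else Array.empty[Array[(String, String)]]
  } else {
    configs
  }
```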
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23194#discussion_r238256775 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -439,31 +440,22 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { }.getMessage assert(ex.contains(exMsgWithDefaultDB)) } - } finally { -waitForTasksToFinish() -Utils.deleteRecursively(tableLoc) } } } test("rename a managed table with existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab2"))) -try { +withTableLocation("tab2") { tableLoc => withTable("tab1") { sql(s"CREATE TABLE tab1 USING $dataSource AS SELECT 1, 'a'") -tableLoc.mkdir() val ex = intercept[AnalysisException] { sql("ALTER TABLE tab1 RENAME TO tab2") }.getMessage val expectedMsg = "Can not rename the managed table('`tab1`'). The associated location" assert(ex.contains(expectedMsg)) } -} finally { - waitForTasksToFinish() - Utils.deleteRecursively(tableLoc) } } - --- End diff -- nit: revert this --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23194#discussion_r238256611 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -377,41 +377,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } - test("CTAS a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) -try { + protected def withTableLocation(tableNames: String)(f: File => Unit): Unit = { +val tableLoc = + new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier(tableNames))) +try --- End diff -- nit `try {`
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23194#discussion_r238256563 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -377,41 +377,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } - test("CTAS a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) --- End diff -- How about `withEmptyDirInTablePath(dirName: String)`?
[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22318 In the example @viirya described above (https://github.com/apache/spark/pull/22318#issuecomment-426317617), I think the interpretation is unclear to most users and I'm fairly concerned that it could be error-prone...
[GitHub] spark pull request #23094: [SPARK-26077][SQL] Reserved SQL words are not esc...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23094#discussion_r238239581 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -87,11 +87,32 @@ abstract class JdbcDialect extends Serializable { def getJDBCType(dt: DataType): Option[JdbcType] = None /** - * Quotes the identifier. This is used to put quotes around the identifier in case the column - * name is a reserved keyword, or in case it contains characters that require quotes (e.g. space). + * Gets the character used for identifier quoting. + */ + def getIdentifierQuoteCharacter: String = """"""" --- End diff -- btw, I feel we need more general name handling here, like `UnresolvedAttribute.parseAttributeName` https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala#L151 cc: @gatorsmile
[GitHub] spark pull request #23094: [SPARK-26077][SQL] Reserved SQL words are not esc...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23094#discussion_r238239195 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -87,11 +87,32 @@ abstract class JdbcDialect extends Serializable { def getJDBCType(dt: DataType): Option[JdbcType] = None /** - * Quotes the identifier. This is used to put quotes around the identifier in case the column - * name is a reserved keyword, or in case it contains characters that require quotes (e.g. space). + * Gets the character used for identifier quoting. + */ + def getIdentifierQuoteCharacter: String = """"""" --- End diff -- I like a simpler API design; how about splitting an identifier into its two parts (db and table names) outside `JdbcDialect`, and then applying `quoteIdentifier` to each name part?
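The simpler design suggested here — split the qualified name outside the dialect, then quote each part — can be sketched in plain Scala (the helper names below are hypothetical illustrations, not the actual `JdbcDialect` API):

```scala
// Quote one identifier part, doubling any embedded quote characters
// (the usual SQL escaping rule), then wrap it in the quote character.
def quoteIdentifier(part: String, quoteChar: String = "\""): String =
  quoteChar + part.replace(quoteChar, quoteChar * 2) + quoteChar

// Split a possibly qualified name (db.table) and quote each part, so
// reserved words like `order` are safe in generated SQL.
def quoteQualifiedName(name: String, quoteChar: String = "\""): String =
  name.split('.').map(p => quoteIdentifier(p, quoteChar)).mkString(".")
```

For example, `quoteQualifiedName("mydb.order")` yields `"mydb"."order"`, and a MySQL dialect could pass a backquote as `quoteChar` instead.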
[GitHub] spark pull request #23094: [SPARK-26077][SQL] Reserved SQL words are not esc...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23094#discussion_r238232050 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala --- @@ -87,11 +87,32 @@ abstract class JdbcDialect extends Serializable { def getJDBCType(dt: DataType): Option[JdbcType] = None /** - * Quotes the identifier. This is used to put quotes around the identifier in case the column - * name is a reserved keyword, or in case it contains characters that require quotes (e.g. space). + * Gets the character used for identifier quoting. + */ + def getIdentifierQuoteCharacter: String = """"""" --- End diff -- Do we need this new API instead of `quoteIdentifier`?
[GitHub] spark issue #23094: [SPARK-26077][SQL] Reserved SQL words are not escaped by...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23094 Also, can you add tests in MySQLIntegrationSuite, too?
[GitHub] spark issue #23094: [SPARK-26077][SQL] Reserved SQL words are not escaped by...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23094 @golovan thanks for your work. Can you make the title complete (... -> for table names)?
[GitHub] spark issue #23094: [SPARK-26077][SQL] Reserved SQL words are not escaped by...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23094 ok to test
[GitHub] spark issue #22468: [SPARK-25374][SQL] SafeProjection supports fallback to a...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22468 retest this please
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22204 retest this please
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22198 retest this please
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 retest this please
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238177033 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -148,12 +156,21 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { }) // When we are regenerating the golden files we don't need to run all the configs as they // all need to return the same result - if (regenerateGoldenFiles && configs.nonEmpty) { + if (regenerateGoldenFiles) { configs.take(1) --- End diff -- For better readability, fixed in https://github.com/apache/spark/pull/22512/commits/4cdc5040feb3da1e4cf9efcf434138d5873fae04
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238176184 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala --- @@ -64,7 +85,7 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable i = 0 while (i < validExprs.length) { val (_, ordinal) = validExprs(i) - mutableRow(ordinal) = buffer(ordinal) + fieldWriters(i)(buffer(ordinal)) --- End diff -- fixed in https://github.com/apache/spark/pull/22512/commits/95411c8b8b76503dff93756482642083a694b0b7
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238175630 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -148,12 +156,21 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { }) // When we are regenerating the golden files we don't need to run all the configs as they // all need to return the same result - if (regenerateGoldenFiles && configs.nonEmpty) { + if (regenerateGoldenFiles) { configs.take(1) --- End diff -- Actually, it returns an empty array? ``` scala> Array.empty.take(1) res0: Array[Nothing] = Array() scala> Seq.empty.take(1) res1: Seq[Nothing] = List() ```
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22204 retest this please
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22198 retest this please
[GitHub] spark pull request #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20433#discussion_r238152087 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -335,6 +335,12 @@ object SQLConf { .booleanConf .createWithDefault(true) + val ANSI_SQL_PARSER = +buildConf("spark.sql.parser.ansi.enabled") + .doc("When true, tries to conform to ANSI SQL syntax.") + .booleanConf + .createWithDefault(false) --- End diff -- updated
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 retest this please
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238151565 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala --- @@ -64,7 +85,7 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable i = 0 while (i < validExprs.length) { val (_, ordinal) = validExprs(i) - mutableRow(ordinal) = buffer(ordinal) + fieldWriters(i)(buffer(ordinal)) --- End diff -- ah, sounds reasonable. I'll update later.
[GitHub] spark issue #22999: [SPARK-20319][SQL] Already quoted identifiers are gettin...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22999 Is this only an Oracle dialect issue? What about other dialects?
[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22318 @peter-toth can't we fix the issue in the description without changing [the existing behaviour](https://github.com/apache/spark/pull/22318#issuecomment-426317617)?
[GitHub] spark issue #23032: [WIP][SPARK-26061][SQL][MINOR] Reduce the number of unus...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23032 Can't we fix the root cause in `GenerateUnsafeProjection.createCode()`? Since we don't check whether all the mutable variables are used in the generated code, a similar issue can easily happen in other places...
[GitHub] spark pull request #23199: [SPARK-26245][SQL] Add Float literal
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23199#discussion_r238130900 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -1045,6 +1046,11 @@ DOUBLE_LITERAL | DECIMAL_DIGITS EXPONENT? 'D' {isValidDecimal()}? ; +FLOAT_LITERAL +: DIGIT+ EXPONENT? 'F' +| DECIMAL_DIGITS EXPONENT? 'F' {isValidDecimal()}? --- End diff -- This is implementation-specific; ``` hive> create temporary table t1 as select 1.0F; hive> describe t1; OK f decimal(1,0) hive> create temporary table t2 as select 1.0D; hive> describe t2; OK _c0 double ``` ``` postgres=# create temporary table t1 as select 1.0F; postgres=# \d t1 f | numeric | postgres=# create temporary table t2 as select 1.0D; postgres=# \d t2 d | numeric | ``` ``` mysql> create temporary table t1 as select 1.0F; mysql> describe t1; +---+--+--+-+-+---+ | Field | Type | Null | Key | Default | Extra | +---+--+--+-+-+---+ | F | decimal(2,1) | NO | | 0.0 | NULL | +---+--+--+-+-+---+ mysql> create temporary table t2 as select 1.0D; mysql> describe t2; +---+--+--+-+-+---+ | Field | Type | Null | Key | Default | Extra | +---+--+--+-+-+---+ | D | decimal(2,1) | NO | | 0.0 | NULL | +---+--+--+-+-+---+ ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22512 Yea, I'll update in a few days.
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22512 I'm looking into the failure reason... (it passed locally, but failed on Jenkins...)
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r227626505 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MutableProjectionSuite.scala --- @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.catalyst.expressions + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.Row +import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow} +import org.apache.spark.sql.types._ +import org.apache.spark.unsafe.types.CalendarInterval + +class MutableProjectionSuite extends SparkFunSuite with ExpressionEvalHelper { + + private def createMutableProjection(dataTypes: Array[DataType]): MutableProjection = { +MutableProjection.create(dataTypes.zipWithIndex.map(x => BoundReference(x._2, x._1, true))) + } + + testBothCodegenAndInterpreted("fixed-length types") { +val fixedLengthTypes = Array[DataType]( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType, + DateType, TimestampType) +val proj = createMutableProjection(fixedLengthTypes) +val inputRow = InternalRow.fromSeq( + Seq(false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 5.0, 100, 200L)) +assert(proj(inputRow) === inputRow) + +// Use UnsafeRow as buffer +val numBytes = UnsafeRow.calculateBitSetWidthInBytes(fixedLengthTypes.length) +val unsafeBuffer = UnsafeRow.createFromByteArray(numBytes, fixedLengthTypes.length) +val projUnsafeRow = proj.target(unsafeBuffer)(inputRow) +assert(FromUnsafeProjection(fixedLengthTypes)(projUnsafeRow) === inputRow) + } + + testBothCodegenAndInterpreted("variable-length types") { +val variableLengthTypes = Array( + StringType, DecimalType.defaultConcreteType, CalendarIntervalType, BinaryType, + ArrayType(StringType), MapType(IntegerType, StringType), + StructType.fromDDL("a INT, b STRING"), ObjectType(classOf[java.lang.Integer])) +val proj = createMutableProjection(variableLengthTypes) --- End diff -- sure. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r227626456 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala --- @@ -143,4 +144,25 @@ object InternalRow { case u: UserDefinedType[_] => getAccessor(u.sqlType) case _ => (input, ordinal) => input.get(ordinal, dataType) } + + /** + * Returns a writer for an `InternalRow` with given data type. + */ + def getWriter(ordinal: Int, dt: DataType): (InternalRow, Any) => Unit = dt match { +case BooleanType => (input, v) => input.setBoolean(ordinal, v.asInstanceOf[Boolean]) +case ByteType => (input, v) => input.setByte(ordinal, v.asInstanceOf[Byte]) +case ShortType => (input, v) => input.setShort(ordinal, v.asInstanceOf[Short]) +case IntegerType | DateType => (input, v) => input.setInt(ordinal, v.asInstanceOf[Int]) +case LongType | TimestampType => (input, v) => input.setLong(ordinal, v.asInstanceOf[Long]) +case FloatType => (input, v) => input.setFloat(ordinal, v.asInstanceOf[Float]) +case DoubleType => (input, v) => input.setDouble(ordinal, v.asInstanceOf[Double]) +case DecimalType.Fixed(precision, _) => + (input, v) => input.setDecimal(ordinal, v.asInstanceOf[Decimal], precision) +case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType | + _: MapType | _: ObjectType => + (input, v) => input.update(ordinal, v) +case udt: UserDefinedType[_] => getWriter(ordinal, udt.sqlType) +case NullType => (input, _) => input.setNullAt(ordinal) +case _ => throw new SparkException(s"Unsupported data type $dt") --- End diff -- ok --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
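The pattern in this diff — resolving a writer closure per field type once, instead of dispatching on the data type for every row — can be illustrated without Spark's types (the miniature `DataType` hierarchy below is a hypothetical stand-in for Catalyst's, and `Array[Any]` stands in for `InternalRow`):

```scala
// A miniature stand-in for Catalyst's DataType hierarchy.
sealed trait DataType
case object IntType extends DataType
case object StrType extends DataType

// Resolve a writer closure per field once, as InternalRow.getWriter does;
// each returned closure just casts and stores, so the type dispatch is
// not repeated for every row.
def getWriter(ordinal: Int, dt: DataType): (Array[Any], Any) => Unit = dt match {
  case IntType => (row, v) => row(ordinal) = v.asInstanceOf[Int]
  case StrType => (row, v) => row(ordinal) = v.asInstanceOf[String]
}

// Precompute one writer per schema field, then reuse them per row.
val writers = Seq(IntType, StrType).zipWithIndex.map { case (dt, i) => getWriter(i, dt) }

val row: Array[Any] = {
  val r = new Array[Any](2)
  writers(0)(r, 42)
  writers(1)(r, "a")
  r
}
```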
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21860

LGTM, pending Jenkins
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21860

retest this please
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21860

ok to test
[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22755

It looks good to improve the test coverage. But it seems hard to wrap every test case with `withSQLConf`. So, how about adding a helper function for turning codegen on/off (`WHOLESTAGE_CODEGEN_ENABLED` and `CODEGEN_FACTORY_MODE`), like `testBothCodegenAndInterpreted`? @cloud-fan @gatorsmile
https://github.com/apache/spark/blob/584e767d372d41071c3436f9ad4bf77a820f12b4/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala#L38
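Such a helper could look roughly like the following. This is a sketch against Spark's test utilities: `CodegenObjectFactoryMode` and `SQLConf.CODEGEN_FACTORY_MODE` are taken from the linked suite, while the helper name is the one proposed in the comment.

```scala
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenObjectFactoryMode
import org.apache.spark.sql.internal.SQLConf

// Registers the same test body twice: once per codegen factory mode.
protected def testBothCodegenAndInterpreted(name: String)(f: => Unit): Unit = {
  val modes = Seq(CodegenObjectFactoryMode.CODEGEN_ONLY, CodegenObjectFactoryMode.NO_CODEGEN)
  for (fallbackMode <- modes) {
    test(s"$name with $fallbackMode") {
      withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> fallbackMode.toString) {
        f
      }
    }
  }
}
```

Each call site then exercises both the generated and the interpreted code paths without repeating the `withSQLConf` boilerplate.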
[GitHub] spark issue #17174: [SPARK-19145][SQL] Timestamp to String casting is slowin...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/17174

@dongjoon-hyun Thanks for your checks!! I'm still waiting for feedback from other developers. If the feedback on the approach is positive, I'll make a pr. Other ideas to solve this are also welcome. cc: @gatorsmile @hvanhovell @viirya
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r227609747

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala ---
@@ -143,4 +144,24 @@ object InternalRow {
     case u: UserDefinedType[_] => getAccessor(u.sqlType)
     case _ => (input, ordinal) => input.get(ordinal, dataType)
   }
+
+  /**
+   * Returns a writer for an `InternalRow` with given data type.
+   */
+  def getWriter(ordinal: Int, dt: DataType): (InternalRow, Any) => Unit = dt match {
+    case BooleanType => (input, v) => input.setBoolean(ordinal, v.asInstanceOf[Boolean])
+    case ByteType => (input, v) => input.setByte(ordinal, v.asInstanceOf[Byte])
+    case ShortType => (input, v) => input.setShort(ordinal, v.asInstanceOf[Short])
+    case IntegerType | DateType => (input, v) => input.setInt(ordinal, v.asInstanceOf[Int])
+    case LongType | TimestampType => (input, v) => input.setLong(ordinal, v.asInstanceOf[Long])
+    case FloatType => (input, v) => input.setFloat(ordinal, v.asInstanceOf[Float])
+    case DoubleType => (input, v) => input.setDouble(ordinal, v.asInstanceOf[Double])
+    case DecimalType.Fixed(precision, _) =>
+      (input, v) => input.setDecimal(ordinal, v.asInstanceOf[Decimal], precision)
+    case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType |
+        _: MapType | _: ObjectType | _: UserDefinedType[_] =>
+      (input, v) => input.update(ordinal, v)
+    case NullType => (input, v) => {}
--- End diff --

ok
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r227609600

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala ---
@@ -143,4 +144,24 @@ object InternalRow {
     case u: UserDefinedType[_] => getAccessor(u.sqlType)
     case _ => (input, ordinal) => input.get(ordinal, dataType)
   }
+
+  /**
+   * Returns a writer for an `InternalRow` with given data type.
+   */
+  def getWriter(ordinal: Int, dt: DataType): (InternalRow, Any) => Unit = dt match {
+    case BooleanType => (input, v) => input.setBoolean(ordinal, v.asInstanceOf[Boolean])
+    case ByteType => (input, v) => input.setByte(ordinal, v.asInstanceOf[Byte])
+    case ShortType => (input, v) => input.setShort(ordinal, v.asInstanceOf[Short])
+    case IntegerType | DateType => (input, v) => input.setInt(ordinal, v.asInstanceOf[Int])
+    case LongType | TimestampType => (input, v) => input.setLong(ordinal, v.asInstanceOf[Long])
+    case FloatType => (input, v) => input.setFloat(ordinal, v.asInstanceOf[Float])
+    case DoubleType => (input, v) => input.setDouble(ordinal, v.asInstanceOf[Double])
+    case DecimalType.Fixed(precision, _) =>
+      (input, v) => input.setDecimal(ordinal, v.asInstanceOf[Decimal], precision)
+    case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType |
+        _: MapType | _: ObjectType | _: UserDefinedType[_] =>
--- End diff --

ok
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r227226902

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -140,6 +141,14 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
     val input = fileToString(new File(testCase.inputFile))

     val (comments, code) = input.split("\n").partition(_.startsWith("--"))
+
+    // Runs all the tests on both codegen-only and interpreter modes. Since explain results differ
+    // when `WHOLESTAGE_CODEGEN_ENABLED` disabled, we don't run these tests now.
+    val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", "CODEGEN_ONLY")).map {
+      case (wholeStageCodegenEnabled, codegenFactoryMode) =>
+        Array( // SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> wholeStageCodegenEnabled,
--- End diff --

ok
[GitHub] spark issue #22512: [SPARK-25498][SQL] InterpretedMutableProjection should h...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22512

ok, I'll add tests.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r227224209

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
@@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
   def currentValue: InternalRow = mutableRow

   override def target(row: InternalRow): MutableProjection = {
+    // If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
+    assert(!row.isInstanceOf[UnsafeRow] ||
+      validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
     mutableRow = row
     this
   }

+  private[this] val fieldWriters = validExprs.map { case (e, i) =>
+    val writer = generateRowWriter(i, e.dataType)
+    if (!e.nullable) {
+      (v: Any) => writer(v)
+    } else {
+      (v: Any) => {
+        if (v == null) {
+          mutableRow.setNullAt(i)
+        } else {
+          writer(v)
+        }
+      }
+    }
+  }
+
+  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
--- End diff --

oh, yes! yea, I will.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22512#discussion_r227224030

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InterpretedMutableProjection.scala ---
@@ -49,10 +51,54 @@ class InterpretedMutableProjection(expressions: Seq[Expression]) extends Mutable
   def currentValue: InternalRow = mutableRow

   override def target(row: InternalRow): MutableProjection = {
+    // If `mutableRow` is `UnsafeRow`, `MutableProjection` accepts fixed-length types only
+    assert(!row.isInstanceOf[UnsafeRow] ||
+      validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) })
     mutableRow = row
     this
   }

+  private[this] val fieldWriters = validExprs.map { case (e, i) =>
+    val writer = generateRowWriter(i, e.dataType)
+    if (!e.nullable) {
+      (v: Any) => writer(v)
+    } else {
+      (v: Any) => {
+        if (v == null) {
+          mutableRow.setNullAt(i)
+        } else {
+          writer(v)
+        }
+      }
+    }
+  }
+
+  private def generateRowWriter(ordinal: Int, dt: DataType): Any => Unit = dt match {
+    case BooleanType =>
+      v => mutableRow.setBoolean(ordinal, v.asInstanceOf[Boolean])
+    case ByteType =>
+      v => mutableRow.setByte(ordinal, v.asInstanceOf[Byte])
+    case ShortType =>
+      v => mutableRow.setShort(ordinal, v.asInstanceOf[Short])
+    case IntegerType | DateType =>
+      v => mutableRow.setInt(ordinal, v.asInstanceOf[Int])
+    case LongType | TimestampType =>
+      v => mutableRow.setLong(ordinal, v.asInstanceOf[Long])
+    case FloatType =>
+      v => mutableRow.setFloat(ordinal, v.asInstanceOf[Float])
+    case DoubleType =>
+      v => mutableRow.setDouble(ordinal, v.asInstanceOf[Double])
+    case DecimalType.Fixed(precision, _) =>
+      v => mutableRow.setDecimal(ordinal, v.asInstanceOf[Decimal], precision)
+    case CalendarIntervalType | BinaryType | _: ArrayType | StringType | _: StructType |
+        _: MapType | _: UserDefinedType[_] =>
+      v => mutableRow.update(ordinal, v)
+    case NullType =>
+      v => {}
--- End diff --

We need to take care of `e.nullable && e.dataType == NullType` here?
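The nullable wrapper discussed in the diff can be illustrated without Spark: the null check is composed around the type-specific writer once, at projection-construction time, rather than on every row. A simplified sketch (the `MiniRow` type is a stand-in for `InternalRow`, not Spark's actual class):

```scala
// Stand-in for InternalRow, for illustration only.
class MiniRow(val values: Array[Any]) {
  def setNullAt(i: Int): Unit = values(i) = null
  def setLong(i: Int, v: Long): Unit = values(i) = v
}

// Builds the field writer once; the null branch is only present for nullable fields.
def longWriterFor(ordinal: Int, nullable: Boolean): (MiniRow, Any) => Unit = {
  val write: (MiniRow, Any) => Unit =
    (row, v) => row.setLong(ordinal, v.asInstanceOf[Long])
  if (!nullable) {
    write
  } else {
    (row, v) => if (v == null) row.setNullAt(ordinal) else write(row, v)
  }
}

val writer = longWriterFor(0, nullable = true)
val row = new MiniRow(new Array[Any](1))
writer(row, 129L)
writer(row, null) // nullable path: sets the slot to null instead of casting
```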
[GitHub] spark pull request #22755: [SPARK-25755][SQL][Test] Supplementation of non-C...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22755#discussion_r227215773

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala ---
@@ -122,19 +122,22 @@ class ExistenceJoinSuite extends SparkPlanTest with SharedSQLContext {

   test(s"$testName using BroadcastHashJoin") {
     extractJoinParts().foreach { case (_, leftKeys, rightKeys, boundCondition, _, _) =>
-      withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
-        checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
-          EnsureRequirements(left.sqlContext.sessionState.conf).apply(
-            BroadcastHashJoinExec(
-              leftKeys, rightKeys, joinType, BuildRight, boundCondition, left, right)),
-          expectedAnswer,
-          sortAnswers = true)
-        checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: SparkPlan) =>
-          EnsureRequirements(left.sqlContext.sessionState.conf).apply(
-            createLeftSemiPlusJoin(BroadcastHashJoinExec(
-              leftKeys, rightKeys, leftSemiPlus, BuildRight, boundCondition, left, right))),
-          expectedAnswer,
-          sortAnswers = true)
+      Seq("false", "true").foreach { v =>
--- End diff --

nit: v -> codegenEnabled
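With the suggested rename applied, the loop wrapping the join checks would read roughly like this (a sketch of the reviewed test structure; the body is elided and the exact config keys are those toggled in the PR):

```scala
// Run the join assertions once with whole-stage codegen off, once with it on.
Seq("false", "true").foreach { codegenEnabled =>
  withSQLConf(
      SQLConf.SHUFFLE_PARTITIONS.key -> "1",
      SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegenEnabled) {
    // ... the two checkAnswer2 calls from the diff above ...
  }
}
```

Naming the loop variable `codegenEnabled` makes the intent of the `"false"`/`"true"` pair obvious at the use site.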
[GitHub] spark issue #22755: [SPARK-25755][SQL][Test] Supplementation of non-CodeGen ...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22755

ok to test
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22778#discussion_r227215135

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
       Batch("Rewrite Subquery", FixedPoint(1),
         RewritePredicateSubquery,
         ColumnPruning,
+        InferFiltersFromConstraints,
+        PushDownPredicate,
         CollapseProject,
         RemoveRedundantProject) :: Nil
   }

   test("Column pruning after rewriting predicate subquery") {
-    val relation = LocalRelation('a.int, 'b.int)
-    val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+    withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
+      val relation = LocalRelation('a.int, 'b.int)
+      val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)

-    val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
+      val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)

-    val optimized = Optimize.execute(query.analyze)
-    val correctAnswer = relation
-      .select('a)
-      .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
-      .analyze
+      val optimized = Optimize.execute(query.analyze)
+      val correctAnswer = relation
+        .select('a)
+        .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
+        .analyze

-    comparePlans(optimized, correctAnswer)
+      comparePlans(optimized, correctAnswer)
+    }
+  }
+
+  test("Infer filters and push down predicate after rewriting predicate subquery") {
--- End diff --

How about making the test title simple, then leaving clear comments here about what's tested?
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22778#discussion_r227214593

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
       Batch("Rewrite Subquery", FixedPoint(1),
         RewritePredicateSubquery,
         ColumnPruning,
+        InferFiltersFromConstraints,
+        PushDownPredicate,
         CollapseProject,
         RemoveRedundantProject) :: Nil
   }

   test("Column pruning after rewriting predicate subquery") {
-    val relation = LocalRelation('a.int, 'b.int)
-    val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+    withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
+      val relation = LocalRelation('a.int, 'b.int)
+      val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)

-    val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)
+      val query = relation.where('a.in(ListQuery(relInSubquery.select('x)))).select('a)

-    val optimized = Optimize.execute(query.analyze)
-    val correctAnswer = relation
-      .select('a)
-      .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
-      .analyze
+      val optimized = Optimize.execute(query.analyze)
+      val correctAnswer = relation
+        .select('a)
+        .join(relInSubquery.select('x), LeftSemi, Some('a === 'x))
+        .analyze

-    comparePlans(optimized, correctAnswer)
+      comparePlans(optimized, correctAnswer)
+    }
+  }
+
+  test("Infer filters and push down predicate after rewriting predicate subquery") {
--- End diff --

Need the column pruning in the test title?
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22778#discussion_r227214404

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
       Batch("Rewrite Subquery", FixedPoint(1),
         RewritePredicateSubquery,
         ColumnPruning,
+        InferFiltersFromConstraints,
+        PushDownPredicate,
         CollapseProject,
         RemoveRedundantProject) :: Nil
   }

   test("Column pruning after rewriting predicate subquery") {
-    val relation = LocalRelation('a.int, 'b.int)
-    val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+    withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
--- End diff --

Ah, I see. Thanks.
[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22778#discussion_r227205054

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala ---
@@ -33,23 +34,44 @@ class RewriteSubquerySuite extends PlanTest {
       Batch("Rewrite Subquery", FixedPoint(1),
         RewritePredicateSubquery,
         ColumnPruning,
+        InferFiltersFromConstraints,
+        PushDownPredicate,
         CollapseProject,
         RemoveRedundantProject) :: Nil
   }

   test("Column pruning after rewriting predicate subquery") {
-    val relation = LocalRelation('a.int, 'b.int)
-    val relInSubquery = LocalRelation('x.int, 'y.int, 'z.int)
+    withSQLConf(SQLConf.CONSTRAINT_PROPAGATION_ENABLED.key -> "false") {
--- End diff --

We need to modify this existing test?
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22778

Also, to make sure there is no performance regression in the optimizer, can you check the optimizer statistics on TPCDS by running `TPCDSQuerySuite`, too?
[GitHub] spark issue #22778: [SPARK-25784][SQL] Infer filters from constraints after ...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22778

Can you put a concrete example of the missing case you described into the PR description?
[GitHub] spark pull request #22712: [SPARK-25724] Add sorting functionality in MapTyp...
Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22712#discussion_r227201626

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ordering.scala ---
@@ -53,6 +53,10 @@ class InterpretedOrdering(ordering: Seq[SortOrder]) extends Ordering[InternalRow
           a.interpretedOrdering.asInstanceOf[Ordering[Any]].compare(left, right)
         case a: ArrayType if order.direction == Descending =>
           a.interpretedOrdering.asInstanceOf[Ordering[Any]].reverse.compare(left, right)
+        case m: MapType if m.isOrdered && order.direction == Ascending =>
+          m.interpretedOrdering.asInstanceOf[Ordering[Any]].compare(left, right)
+        case m: MapType if m.isOrdered && order.direction == Descending =>
+          m.interpretedOrdering.asInstanceOf[Ordering[Any]].reverse.compare(left, right)
--- End diff --

You mean we can't remove this? If it's not necessary, it's better to remove it.
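The Ascending/Descending split in the diff relies on `scala.math.Ordering#reverse`, which flips the sign of every comparison. A self-contained plain-Scala illustration of that equivalence:

```scala
// Descending order is just the reverse of the ascending Ordering.
val asc: Ordering[Int] = Ordering.Int
val desc: Ordering[Int] = asc.reverse

assert(asc.compare(1, 2) < 0)   // 1 sorts before 2 ascending
assert(desc.compare(1, 2) > 0)  // 1 sorts after 2 descending
```

This is why `InterpretedOrdering` only needs one `interpretedOrdering` per type and can derive the descending case from it.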
[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/22219

ok to test
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/20433

retest this please