[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19776


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152429647
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,231 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+    val attrInt = 'cint.int
+    val attrStr = 'cstr.string
+
+    testTranslateFilter(EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
+    testTranslateFilter(EqualTo(1, attrInt), Some(sources.EqualTo("cint", 1)))
+
+    testTranslateFilter(EqualNullSafe(attrStr, Literal(null)),
+      Some(sources.EqualNullSafe("cstr", null)))
+    testTranslateFilter(EqualNullSafe(Literal(null), attrStr),
+      Some(sources.EqualNullSafe("cstr", null)))
+
+    testTranslateFilter(GreaterThan(attrInt, 1), Some(sources.GreaterThan("cint", 1)))
+    testTranslateFilter(GreaterThan(1, attrInt), Some(sources.LessThan("cint", 1)))
+
+    testTranslateFilter(LessThan(attrInt, 1), Some(sources.LessThan("cint", 1)))
+    testTranslateFilter(LessThan(1, attrInt), Some(sources.GreaterThan("cint", 1)))
+
+    testTranslateFilter(GreaterThanOrEqual(attrInt, 1), Some(sources.GreaterThanOrEqual("cint", 1)))
+    testTranslateFilter(GreaterThanOrEqual(1, attrInt), Some(sources.LessThanOrEqual("cint", 1)))
+
+    testTranslateFilter(LessThanOrEqual(attrInt, 1), Some(sources.LessThanOrEqual("cint", 1)))
+    testTranslateFilter(LessThanOrEqual(1, attrInt), Some(sources.GreaterThanOrEqual("cint", 1)))
+
+    testTranslateFilter(InSet(attrInt, Set(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
+
+    testTranslateFilter(In(attrInt, Seq(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
+
+    testTranslateFilter(IsNull(attrInt), Some(sources.IsNull("cint")))
+    testTranslateFilter(IsNotNull(attrInt), Some(sources.IsNotNull("cint")))
+
+    // cint > 1 AND cint < 10
+    testTranslateFilter(And(
+      GreaterThan(attrInt, 1),
+      LessThan(attrInt, 10)),
+      Some(sources.And(
+        sources.GreaterThan("cint", 1),
+        sources.LessThan("cint", 10))))
+
+    // cint >= 8 OR cint <= 2
+    testTranslateFilter(Or(
+      GreaterThanOrEqual(attrInt, 8),
+      LessThanOrEqual(attrInt, 2)),
+      Some(sources.Or(
+        sources.GreaterThanOrEqual("cint", 8),
+        sources.LessThanOrEqual("cint", 2))))
+
+    testTranslateFilter(Not(GreaterThanOrEqual(attrInt, 8)),
+      Some(sources.Not(sources.GreaterThanOrEqual("cint", 8))))
+
+    testTranslateFilter(StartsWith(attrStr, "a"), Some(sources.StringStartsWith("cstr", "a")))
+
+    testTranslateFilter(EndsWith(attrStr, "a"), Some(sources.StringEndsWith("cstr", "a")))
+
+    testTranslateFilter(Contains(attrStr, "a"), Some(sources.StringContains("cstr", "a")))
+  }
+
+  test("translate complex expression") {
+    val attrInt = 'cint.int
+
+    // ABS(cint) - 2 = 1
+    testTranslateFilter(LessThanOrEqual(
+      // Expressions are not supported
--- End diff --

good catch @_@ fixed the typo. Thanks!


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152425778
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,231 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+    val attrInt = 'cint.int
+    val attrStr = 'cstr.string
+
+    testTranslateFilter(EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
+    testTranslateFilter(EqualTo(1, attrInt), Some(sources.EqualTo("cint", 1)))
+
+    testTranslateFilter(EqualNullSafe(attrStr, Literal(null)),
+      Some(sources.EqualNullSafe("cstr", null)))
+    testTranslateFilter(EqualNullSafe(Literal(null), attrStr),
+      Some(sources.EqualNullSafe("cstr", null)))
+
+    testTranslateFilter(GreaterThan(attrInt, 1), Some(sources.GreaterThan("cint", 1)))
+    testTranslateFilter(GreaterThan(1, attrInt), Some(sources.LessThan("cint", 1)))
+
+    testTranslateFilter(LessThan(attrInt, 1), Some(sources.LessThan("cint", 1)))
+    testTranslateFilter(LessThan(1, attrInt), Some(sources.GreaterThan("cint", 1)))
+
+    testTranslateFilter(GreaterThanOrEqual(attrInt, 1), Some(sources.GreaterThanOrEqual("cint", 1)))
+    testTranslateFilter(GreaterThanOrEqual(1, attrInt), Some(sources.LessThanOrEqual("cint", 1)))
+
+    testTranslateFilter(LessThanOrEqual(attrInt, 1), Some(sources.LessThanOrEqual("cint", 1)))
+    testTranslateFilter(LessThanOrEqual(1, attrInt), Some(sources.GreaterThanOrEqual("cint", 1)))
+
+    testTranslateFilter(InSet(attrInt, Set(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
+
+    testTranslateFilter(In(attrInt, Seq(1, 2, 3)), Some(sources.In("cint", Array(1, 2, 3))))
+
+    testTranslateFilter(IsNull(attrInt), Some(sources.IsNull("cint")))
+    testTranslateFilter(IsNotNull(attrInt), Some(sources.IsNotNull("cint")))
+
+    // cint > 1 AND cint < 10
+    testTranslateFilter(And(
+      GreaterThan(attrInt, 1),
+      LessThan(attrInt, 10)),
+      Some(sources.And(
+        sources.GreaterThan("cint", 1),
+        sources.LessThan("cint", 10))))
+
+    // cint >= 8 OR cint <= 2
+    testTranslateFilter(Or(
+      GreaterThanOrEqual(attrInt, 8),
+      LessThanOrEqual(attrInt, 2)),
+      Some(sources.Or(
+        sources.GreaterThanOrEqual("cint", 8),
+        sources.LessThanOrEqual("cint", 2))))
+
+    testTranslateFilter(Not(GreaterThanOrEqual(attrInt, 8)),
+      Some(sources.Not(sources.GreaterThanOrEqual("cint", 8))))
+
+    testTranslateFilter(StartsWith(attrStr, "a"), Some(sources.StringStartsWith("cstr", "a")))
+
+    testTranslateFilter(EndsWith(attrStr, "a"), Some(sources.StringEndsWith("cstr", "a")))
+
+    testTranslateFilter(Contains(attrStr, "a"), Some(sources.StringContains("cstr", "a")))
+  }
+
+  test("translate complex expression") {
+    val attrInt = 'cint.int
+
+    // ABS(cint) - 2 = 1
+    testTranslateFilter(LessThanOrEqual(
+      // Expressions are not supported
--- End diff --

?


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152421950
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+    val attrInt = 'cint.int
+    val attrStr = 'cstr.string
+
+    assertResult(Some(sources.EqualTo("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.EqualTo(attrInt, 1))
+    }
--- End diff --

Thanks! I've followed your suggestion and the test suite looks cleaner now. 


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152422048
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+    val attrInt = 'cint.int
+    val attrStr = 'cstr.string
+
+    assertResult(Some(sources.EqualTo("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.EqualTo(attrInt, 1))
+    }
+    assertResult(Some(sources.EqualTo("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.EqualTo(1, attrInt))
+    }
+
+    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
+      DataSourceStrategy.translateFilter(
+        expressions.EqualNullSafe(attrStr, Literal(null)))
+    }
+    assertResult(Some(sources.EqualNullSafe("cstr", null))) {
+      DataSourceStrategy.translateFilter(
+        expressions.EqualNullSafe(Literal(null), attrStr))
+    }
+
+    assertResult(Some(sources.GreaterThan("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.GreaterThan(attrInt, 1))
+    }
+    assertResult(Some(sources.GreaterThan("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.LessThan(1, attrInt))
+    }
+
+    assertResult(Some(sources.LessThan("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.LessThan(attrInt, 1))
+    }
+    assertResult(Some(sources.LessThan("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.GreaterThan(1, attrInt))
+    }
+
+    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.GreaterThanOrEqual(attrInt, 1))
+    }
+    assertResult(Some(sources.GreaterThanOrEqual("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.LessThanOrEqual(1, attrInt))
+    }
+
+    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.LessThanOrEqual(attrInt, 1))
+    }
+    assertResult(Some(sources.LessThanOrEqual("cint", 1))) {
+      DataSourceStrategy.translateFilter(
+        expressions.GreaterThanOrEqual(1, attrInt))
+    }
+
+    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
+      DataSourceStrategy.translateFilter(
+        expressions.InSet(attrInt, Set(1, 2, 3)))
+    }
+
+    assertResult(Some(sources.In("cint", Array(1, 2, 3)))) {
+      DataSourceStrategy.translateFilter(
+        expressions.In(attrInt, Seq(1, 2, 3)))
+    }
+
+    assertResult(Some(sources.IsNull("cint"))) {
+      DataSourceStrategy.translateFilter(
+        expressions.IsNull(attrInt))
+    }
+    assertResult(Some(sources.IsNotNull("cint"))) {
+      DataSourceStrategy.translateFilter(
+        expressions.IsNotNull(attrInt))
+    }
+
+    assertResult(Some(sources.And(
+      sources.GreaterThan("cint", 1),
+      sources.LessThan("cint", 10)))) {
+      DataSourceStrategy.translateFilter(expressions.And(
+        expressions.GreaterThan(attrInt, 1),
+        expressions.LessThan(attrInt, 10)
+      ))
+    }
+
+    assertResult(Some(sources.Or(
+      sources.GreaterThanOrEqual("cint", 8),

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152421982
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,9 +296,15 @@ class JDBCSuite extends SparkFunSuite
     // The older versions of spark have this kind of bugs in parquet data source.
     val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
--- End diff --

Fixed.


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152421773
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
--- End diff --

Thanks for the suggestion. Fixed. 


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152263922
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
--- End diff --

As you import `expressions._`, I think we can write `EqualTo` instead of 
`expressions.EqualTo` for the catalyst predicates below?

Because you always write `sources.EqualTo`, I think we won't confuse the two?
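
For reference, that is how the final revision quoted earlier in this thread ends up 
reading; a sketch of two call sites (reusing the suite's `attrInt` attribute and the 
`testTranslateFilter` helper):

```scala
// Catalyst predicates unqualified via expressions._; source filters keep the
// `sources.` prefix, so the two EqualTo classes remain easy to tell apart.
testTranslateFilter(EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
testTranslateFilter(GreaterThan(1, attrInt), Some(sources.LessThan("cint", 1)))
```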


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152263135
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,307 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = 'cint.int
+val attrStr = 'cstr.string
+
+assertResult(Some(sources.EqualTo("cint", 1))) {
+  DataSourceStrategy.translateFilter(
+expressions.EqualTo(attrInt, 1))
+}
--- End diff --

Looks like we can have a small helper function:

```scala
def testTranslateFilter(catalystFilter: Expression, result: Option[sources.Filter]): Unit = {
  assertResult(result) {
    DataSourceStrategy.translateFilter(catalystFilter)
  }
}
```

So the tests can be rewritten as:
```scala
testTranslateFilter(expressions.EqualTo(attrInt, 1), Some(sources.EqualTo("cint", 1)))
```


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152261544
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,9 +296,15 @@ class JDBCSuite extends SparkFunSuite
     // The older versions of spark have this kind of bugs in parquet data source.
     val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
--- End diff --

As I mentioned in https://github.com/apache/spark/pull/10468#discussion_r151900175, 
the above test doesn't actually exercise the SPARK-12218 issue.
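
A query shape that does exercise it is the `df4`-style case added later in this PR; a 
sketch against the existing `foobar` fixture (the expected count assumes its usual 
fred/mary/joe rows):

```scala
// TRIM(NAME) = 'mary' cannot be translated to a source filter, so the AND that
// contains it must not be pushed down one leg at a time. If only THEID > 0 were
// pushed under the OR and treated as fully handled, extra rows would come back.
val df = sql("SELECT * FROM foobar WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
assert(df.collect().length == 2)  // 'mary' and 'fred'
```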


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152260557
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,19 @@ object DataSourceStrategy {
         Some(sources.IsNotNull(a.name))
 
       case expressions.And(left, right) =>
-        (translateFilter(left) ++ translateFilter(right)).reduceOption(sources.And)
+        // See SPARK-12218 for detailed discussion
+        // It is not safe to just convert one side if we do not understand the
+        // other side. Here is an example used to explain the reason.
+        // Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
+        // and we do not understand how to convert trim(b) = 'blah'.
+        // If we only convert a = 2, we will end up with
+        // (a = 2) OR (c > 0), which will generate wrong results.
+        // Pushing one leg of AND down is only safe to do at the top level.
+        // You can see ParquetFilters' createFilter for more details.
+        for {
+          leftFilter <- translateFilter(left)
+          rightFilter <- translateFilter(right)
+        } yield sources.And(leftFilter, rightFilter)
--- End diff --

Although Catalyst predicate expressions are all converted to `sources.Filter` when we 
try to push them down, not all convertible filters can be handled by Parquet and ORC. 
So I think we can still face the case where only one sub-filter of an `AND` can be 
pushed down by the file format.
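
To make that concrete, here is a plain-Scala toy model (no Spark APIs; the rows and 
names are made up) of why collapsing a nested `AND` to its translatable leg is unsafe, 
while splitting a top-level `AND` stays correct as long as the leftover conjunct is 
re-evaluated:

```scala
// Toy model, not Spark code: "pushing" a filter here means the source evaluates it
// and the result is trusted without re-checking the original predicate.
object NestedAndPushdownDemo extends App {
  final case class Record(a: Int, b: String, c: Int)
  val table = Seq(Record(2, "blah", 0), Record(2, "nope", 0), Record(1, "x", 5))

  // Original predicate: (a = 2 AND trim(b) = 'blah') OR (c > 0)
  def original(r: Record): Boolean = (r.a == 2 && r.b.trim == "blah") || r.c > 0

  // Pre-fix behavior: trim(b) = 'blah' is untranslatable, the nested AND collapses
  // to just a = 2, and the resulting OR is still treated as a faithful translation.
  def collapsedNestedAnd(r: Record): Boolean = r.a == 2 || r.c > 0

  // The collapsed filter lets Record(2, "nope", 0) through even though the original
  // predicate rejects it: wrong results, not just wasted I/O.
  assert(table.filter(original) != table.filter(collapsedNestedAnd))

  // At the top level, pushing only one conjunct is safe because the untranslated
  // conjunct is re-evaluated afterwards: a = 2 AND trim(b) = 'blah'
  def pushedConjunct(r: Record): Boolean = r.a == 2              // handled by the source
  def residualConjunct(r: Record): Boolean = r.b.trim == "blah"  // re-checked afterwards
  assert(table.filter(r => pushedConjunct(r) && residualConjunct(r)) ==
    table.filter(pushedConjunct).filter(residualConjunct))
}
```

With the for-comprehension in this patch, the nested `AND` either translates both legs 
or is not translated at all, which rules out the first case; the top-level split is 
what ParquetFilters-style pushdown can still do safely.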


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152206765
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200656
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
     // The older versions of spark have this kind of bugs in parquet data source.
     val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 'mary')")
     val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT (NAME != 'mary')")
+    val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 'mary') OR (NAME = 'fred')")
+    val df4 = sql("SELECT * FROM foobar " +
+      "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+    val df5 = sql("SELECT * FROM foobar " +
+      "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+    val df6 = sql("SELECT * FROM foobar " +
+      "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+    val df7 = sql("SELECT * FROM foobar " +
+      "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+    val df8 = sql("SELECT * FROM foobar " +
+      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 'fred'))")
+    val df9 = sql("SELECT * FROM foobar " +
+      "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR TRIM(NAME) != 'fred'))")
+    val df10 = sql("SELECT * FROM foobar " +
--- End diff --

Why do we need to test so many cases? As an end-to-end test, I think we only need a 
typical case.


---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200310
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200223
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200130
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200166
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200090
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152200107
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---

[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152193847
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,306 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = AttributeReference("cint", IntegerType)()
+val attrStr = AttributeReference("cstr", StringType)()
--- End diff --

Fixed. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152193833
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,305 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.{sources, QueryTest}
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = AttributeReference("cint", IntegerType)()
+val attrStr = AttributeReference("cstr", StringType)()
+
+assertResult(Some(sources.EqualTo("cint", 1))) {
+  DataSourceStrategy.translateFilter(
+expressions.EqualTo(attrInt, Literal(1)))
--- End diff --

Fixed. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152192018
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,306 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.PlanTest
+import org.apache.spark.sql.sources
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends PlanTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = AttributeReference("cint", IntegerType)()
+val attrStr = AttributeReference("cstr", StringType)()
--- End diff --

```Scala
import org.apache.spark.sql.catalyst.dsl.expressions._
```

You can simplify your test cases. 

```Scala
val attrInt = 'cint.int
val attrStr = 'cstr.string
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152191858
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,305 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.{sources, QueryTest}
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = AttributeReference("cint", IntegerType)()
+val attrStr = AttributeReference("cstr", StringType)()
+
+assertResult(Some(sources.EqualTo("cint", 1))) {
+  DataSourceStrategy.translateFilter(
+expressions.EqualTo(attrInt, Literal(1)))
--- End diff --

No need to call `Literal` here. It will be implicitly cast to a `Literal`:
```
expressions.EqualTo(attrInt, 1))
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152191659
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,305 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.{sources, QueryTest}
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
+
+  test("translate simple expression") {
+val attrInt = AttributeReference("cint", IntegerType)()
+val attrStr = AttributeReference("cstr", StringType)()
--- End diff --

```Scala
import org.apache.spark.sql.catalyst.dsl.expressions._
```

You can simplify your test cases. 

```Scala
val attrInt = 'cint.int
val attrStr = 'cstr.string
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152191430
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,305 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.{sources, QueryTest}
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
--- End diff --

Fixed. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152190564
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ---
@@ -0,0 +1,305 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.{sources, QueryTest}
+import org.apache.spark.sql.catalyst.expressions
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types._
+
+
+class DataSourceStrategySuite extends QueryTest with SharedSQLContext {
--- End diff --

`extends PlanTest`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152186116
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
 // The older versions of spark have this kind of bugs in parquet data 
source.
 val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 
'mary')")
 val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT 
(NAME != 'mary')")
+val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 
'mary') OR (NAME = 'fred')")
+val df4 = sql("SELECT * FROM foobar " +
+  "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+val df5 = sql("SELECT * FROM foobar " +
+  "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+val df6 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+val df7 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+val df8 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 
'fred'))")
+val df9 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR 
TRIM(NAME) != 'fred'))")
+val df10 = sql("SELECT * FROM foobar " +
+  "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND 
NAME = 'fred')")
+
 assert(df1.collect.toSet === Set(Row("mary", 2)))
 assert(df2.collect.toSet === Set(Row("mary", 2)))
+assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df5.collect.toSet === Set(Row("mary", 2)))
+assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
--- End diff --

I went ahead and added a new ```DataSourceStrategySuite``` to test 
```translateFilter```. Please feel free to let me know of any further comments. 
Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152158653
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
 // The older versions of spark have this kind of bugs in parquet data 
source.
 val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 
'mary')")
 val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT 
(NAME != 'mary')")
+val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 
'mary') OR (NAME = 'fred')")
+val df4 = sql("SELECT * FROM foobar " +
+  "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+val df5 = sql("SELECT * FROM foobar " +
+  "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+val df6 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+val df7 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+val df8 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 
'fred'))")
+val df9 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR 
TRIM(NAME) != 'fred'))")
+val df10 = sql("SELECT * FROM foobar " +
+  "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND 
NAME = 'fred')")
+
 assert(df1.collect.toSet === Set(Row("mary", 2)))
 assert(df2.collect.toSet === Set(Row("mary", 2)))
+assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df5.collect.toSet === Set(Row("mary", 2)))
+assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
--- End diff --

@gatorsmile Thanks for the comments, Sean =) Do you mean you'd like to see test 
cases calling ```DataSourceStrategy.translateFilter``` directly, or end-to-end 
tests like ```sql("SELECT * FROM foobar WHERE THEID < 1")``` that go through the 
translateFilter code path? I'm fine either way, just want to clarify. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152156940
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
 // The older versions of spark have this kind of bugs in parquet data 
source.
 val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 
'mary')")
 val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT 
(NAME != 'mary')")
+val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 
'mary') OR (NAME = 'fred')")
+val df4 = sql("SELECT * FROM foobar " +
+  "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+val df5 = sql("SELECT * FROM foobar " +
+  "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+val df6 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+val df7 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+val df8 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 
'fred'))")
+val df9 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR 
TRIM(NAME) != 'fred'))")
+val df10 = sql("SELECT * FROM foobar " +
+  "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND 
NAME = 'fred')")
+
 assert(df1.collect.toSet === Set(Row("mary", 2)))
 assert(df2.collect.toSet === Set(Row("mary", 2)))
+assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df5.collect.toSet === Set(Row("mary", 2)))
+assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
--- End diff --

They are end-to-end test cases. 

If you can, we should also add such a unit test suite. In the future, we 
can add more unit test cases to verify more complex scenarios. 
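
For example, here is a sketch of such a unit test case (in the 
`assertResult`/`translateFilter` style of the new suite; `StringTrim` just 
stands in for any predicate that cannot be translated, and the column names 
are only illustrative):

```scala
// One leg of the inner AND cannot be translated, so the whole OR must be
// rejected rather than pushed down with that leg silently dropped.
val attrInt = AttributeReference("cint", IntegerType)()
val attrStr = AttributeReference("cstr", StringType)()

assertResult(None) {
  DataSourceStrategy.translateFilter(
    expressions.Or(
      expressions.And(
        expressions.GreaterThan(attrInt, Literal(1)),
        expressions.EqualTo(StringTrim(attrStr), Literal("mary"))),
      expressions.GreaterThan(attrInt, Literal(0))))
}
```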


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152156397
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,19 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 for detailed discussion
+// It is not safe to just convert one side if we do not understand 
the
+// other side. Here is an example used to explain the reason.
+// Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
+// and we do not understand how to convert trim(b) = 'blah'.
+// If we only convert a = 2, we will end up with
+// (a = 2) OR (c > 0), which will generate wrong results.
+// Pushing one leg of AND down is only safe to do at the top level.
+// You can see ParquetFilters' createFilter for more details.
+for {
+  leftFilter <- translateFilter(left)
+  rightFilter <- translateFilter(right)
+} yield sources.And(leftFilter, rightFilter)
--- End diff --

We do not need to clean up the code in this PR. Let us minimize the code 
changes, which will also simplify the backport.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152137096
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
 // The older versions of spark have this kind of bugs in parquet data 
source.
 val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 
'mary')")
 val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT 
(NAME != 'mary')")
+val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 
'mary') OR (NAME = 'fred')")
+val df4 = sql("SELECT * FROM foobar " +
+  "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+val df5 = sql("SELECT * FROM foobar " +
+  "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+val df6 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+val df7 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+val df8 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 
'fred'))")
+val df9 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR 
TRIM(NAME) != 'fred'))")
+val df10 = sql("SELECT * FROM foobar " +
+  "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND 
NAME = 'fred')")
+
 assert(df1.collect.toSet === Set(Row("mary", 2)))
 assert(df2.collect.toSet === Set(Row("mary", 2)))
+assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df5.collect.toSet === Set(Row("mary", 2)))
+assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
--- End diff --

Sure. I can help.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r152136470
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,19 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 for detailed discussion
+// It is not safe to just convert one side if we do not understand 
the
+// other side. Here is an example used to explain the reason.
+// Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
+// and we do not understand how to convert trim(b) = 'blah'.
+// If we only convert a = 2, we will end up with
+// (a = 2) OR (c > 0), which will generate wrong results.
+// Pushing one leg of AND down is only safe to do at the top level.
+// You can see ParquetFilters' createFilter for more details.
+for {
+  leftFilter <- translateFilter(left)
+  rightFilter <- translateFilter(right)
+} yield sources.And(leftFilter, rightFilter)
--- End diff --

I would think so. SPARK-12218 put fixes into ParquetFilters.createFilter 
and OrcFilters.createFilter. They're similar to 
DataSourceStrategy.translateFilter but have different signatures customized 
for Parquet and ORC. For all data sources, including JDBC, Parquet, etc., 
translateFilter is called to determine whether a predicate Expression can be 
pushed down as a Filter or not. Next, for Parquet and ORC, Filters get mapped 
to Parquet- or ORC-specific filters with their own createFilter method. 

So this PR does help all data sources get the correct set of push-down 
predicates. Without this PR we simply got lucky with Parquet and ORC in terms 
of result correctness because 1) it looks like we always apply a Filter on top 
of the scan; 2) we end up with the same number of rows or more returned when 
one leg is missing from the AND. 

The JDBC data source does not always come with a Filter on top of the scan 
and therefore exposed the bug. 
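
To make that concrete, here is a tiny standalone sketch (hypothetical values, 
not from the PR) of a row where the full predicate and the leg-dropped one 
disagree; a source that re-applies the full Filter on top of the scan hides 
the difference, while a source that trusts the pushed-down predicate returns 
the extra row:

```scala
// Evaluate (p1 AND p2) OR p3 vs. the wrongly pushed (p1 OR p3) on a
// sample row a = 2, b = "x", c = -1.
object DroppedLegSketch extends App {
  val (a, b, c) = (2, "x", -1)

  val p1 = a == 2
  val p2 = b.trim == "blah"   // the leg that could not be pushed down
  val p3 = c > 0

  val fullPredicate  = (p1 && p2) || p3   // false: the row should be filtered out
  val legDroppedPush = p1 || p3           // true:  the source would return the row

  println(s"full=$fullPredicate pushed=$legDroppedPush")
}
```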


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151962055
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -296,8 +296,33 @@ class JDBCSuite extends SparkFunSuite
 // The older versions of spark have this kind of bugs in parquet data 
source.
 val df1 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2 AND NAME != 
'mary')")
 val df2 = sql("SELECT * FROM foobar WHERE NOT (THEID != 2) OR NOT 
(NAME != 'mary')")
+val df3 = sql("SELECT * FROM foobar WHERE (THEID > 0 AND NAME = 
'mary') OR (NAME = 'fred')")
+val df4 = sql("SELECT * FROM foobar " +
+  "WHERE (THEID > 0 AND TRIM(NAME) = 'mary') OR (NAME = 'fred')")
+val df5 = sql("SELECT * FROM foobar " +
+  "WHERE THEID > 0 AND TRIM(NAME) = 'mary' AND LENGTH(NAME) > 3")
+val df6 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR NAME = 'mary' OR NAME = 'fred'")
+val df7 = sql("SELECT * FROM foobar " +
+  "WHERE THEID < 0 OR TRIM(NAME) = 'mary' OR NAME = 'fred'")
+val df8 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR NAME != 
'fred'))")
+val df9 = sql("SELECT * FROM foobar " +
+  "WHERE NOT((THEID < 0 OR NAME != 'mary') AND (THEID != 1 OR 
TRIM(NAME) != 'fred'))")
+val df10 = sql("SELECT * FROM foobar " +
+  "WHERE (NOT(THEID < 0 OR TRIM(NAME) != 'mary')) OR (THEID = 1 AND 
NAME = 'fred')")
+
 assert(df1.collect.toSet === Set(Row("mary", 2)))
 assert(df2.collect.toSet === Set(Row("mary", 2)))
+assert(df3.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df4.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df5.collect.toSet === Set(Row("mary", 2)))
+assert(df6.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df7.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df8.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df9.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
+assert(df10.collect.toSet === Set(Row("fred", 1), Row("mary", 2)))
--- End diff --

I'd like to create a new `DataSourceStrategySuite` to test 
`translateFilter`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151960214
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,19 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 for detailed discussion
+// It is not safe to just convert one side if we do not understand 
the
+// other side. Here is an example used to explain the reason.
+// Let's say we have (a = 2 AND trim(b) = 'blah') OR (c > 0)
+// and we do not understand how to convert trim(b) = 'blah'.
+// If we only convert a = 2, we will end up with
+// (a = 2) OR (c > 0), which will generate wrong results.
+// Pushing one leg of AND down is only safe to do at the top level.
+// You can see ParquetFilters' createFilter for more details.
+for {
+  leftFilter <- translateFilter(left)
+  rightFilter <- translateFilter(right)
+} yield sources.And(leftFilter, rightFilter)
--- End diff --

do we still need SPARK-12218 after this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-20 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151922970
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,11 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 and PR 10362 for detailed discussion
--- End diff --

@viirya I see. Thanks, Simon! I've removed the PR number from the comment.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151920447
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,10 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+for {
--- End diff --

Thanks. Just did that as you suggested.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151920212
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,11 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 and PR 10362 for detailed discussion
--- End diff --

Usually we don't list the PR number; just the JIRA number is enough.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151920126
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,11 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 and PR 10362 for detailed discussion
--- End diff --

Sure. I have added more comments there with an example. Thanks, Sean!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151917654
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,11 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+// See SPARK-12218 and PR 10362 for detailed discussion
--- End diff --

In the comment, you need to give an example to explain why. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151916997
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,10 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+for {
--- End diff --

Yeah. Follow what @yhuai  wrote in the PR 
https://github.com/apache/spark/pull/10362/files


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread jliwork
Github user jliwork commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151901160
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,10 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+for {
--- End diff --

Sure. Will do. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19776#discussion_r151900595
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ---
@@ -497,7 +497,10 @@ object DataSourceStrategy {
 Some(sources.IsNotNull(a.name))
 
   case expressions.And(left, right) =>
-(translateFilter(left) ++ 
translateFilter(right)).reduceOption(sources.And)
+for {
--- End diff --

Let's add a small comment like the one in the PR you pointed out. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19776: [SPARK-22548][SQL] Incorrect nested AND expressio...

2017-11-17 Thread jliwork
GitHub user jliwork opened a pull request:

https://github.com/apache/spark/pull/19776

[SPARK-22548][SQL] Incorrect nested AND expression pushed down to JDB…

…C data source

## What changes were proposed in this pull request?

Let's say I have a nested AND expression shown below, where p2 cannot be 
pushed down:

(p1 AND p2) OR p3

In the current Spark code, during data source filter translation, (p1 AND p2) 
is returned as p1 only and p2 is simply lost. This issue occurs with the JDBC 
data source and is similar to SPARK-12218 for Parquet. When an AND is nested 
below another expression, we should push down either both legs or nothing. 
Note that 1) the current Spark code always splits a conjunctive predicate 
before it determines whether a predicate can be pushed down or not; 2) the 
same translation method is also called by Data Source V2. 
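
As a rough illustration, here is a standalone sketch (it only models the 
translation logic with plain strings and Option, not the real Catalyst 
expressions or data source Filters) of the old vs. new handling of AND:

```scala
// Standalone sketch: Some(pushed form) means a predicate is translatable,
// None means it is not (e.g. trim(b) = 'blah').
object TranslateAndSketch {
  def translate(p: String): Option[String] = p match {
    case "a = 2" | "c > 0" => Some(p)
    case _                 => None
  }

  // Old behaviour: reduceOption silently drops the untranslatable leg.
  def translateAndOld(l: String, r: String): Option[String] =
    (translate(l) ++ translate(r)).reduceOption((x, y) => s"($x AND $y)")

  // Fixed behaviour: translate the AND only if both legs translate.
  def translateAndFixed(l: String, r: String): Option[String] =
    for { lf <- translate(l); rf <- translate(r) } yield s"($lf AND $rf)"

  def main(args: Array[String]): Unit = {
    println(translateAndOld("a = 2", "trim(b) = 'blah'"))   // Some(a = 2) -- p2 lost
    println(translateAndFixed("a = 2", "trim(b) = 'blah'")) // None -- nothing pushed
  }
}
```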

## How was this patch tested?

Added new unit test cases to JDBCSuite


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jliwork/spark spark-22548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19776.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19776


commit 58de88c21210d469b8ef14b1f23764c31ca5651e
Author: Jia Li 
Date:   2017-11-18T01:15:01Z

[SPARK-22548][SQL] Incorrect nested AND expression pushed down to JDBC data 
source




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org