[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68889173
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathExpressionSuite.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, 
Literal, NonFoldableLiteral}
+import org.apache.spark.sql.types.StringType
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * Test suite for various xpath functions.
+ */
+class XPathExpressionSuite extends SparkFunSuite with ExpressionEvalHelper 
{
+
+  private def testBoolean[T](xml: String, path: String, expected: T): Unit 
= {
+checkEvaluation(
+  XPathBoolean(Literal.create(xml, StringType), Literal.create(path, 
StringType)),
+  expected)
+  }
+
+  test("xpath_boolean") {
+testBoolean("b", "a/b", true)
+testBoolean("b", "a/c", false)
+testBoolean("b", "a/b = \"b\"", true)
+testBoolean("b", "a/b = \"c\"", false)
+testBoolean("10", "a/b < 10", false)
+testBoolean("10", "a/b = 10", true)
+
+// null input
+testBoolean(null, null, null)
+testBoolean(null, "a", null)
+testBoolean("10", null, null)
+
+// exception handling for invalid input
+intercept[Exception] {
+  testBoolean("/a>", "a", null)
+}
+  }
+
+  test("xpath_boolean path cache invalidation") {
+// This is a test to ensure the expression is not reusing the path for 
different strings
+val xml = NonFoldableLiteral("b")
+val path = NonFoldableLiteral("a/b")
+val expr = XPathBoolean(xml, path)
+
+// Run evaluation once
+assert(expr.eval(null) == true)
+
+// Change the input path and make sure we don't screw up caching
+path.value = UTF8String.fromString("a/c")
+assert(expr.eval(null) == false)
--- End diff --

updated!






[GitHub] spark issue #13930: [SPARK-16228][SQL] HiveSessionCatalog should return `dou...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13930
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13930: [SPARK-16228][SQL] HiveSessionCatalog should return `dou...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13930
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61446/
Test PASSed.





[GitHub] spark issue #13930: [SPARK-16228][SQL] HiveSessionCatalog should return `dou...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13930
  
**[Test build #61446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61446/consoleFull)**
 for PR 13930 at commit 
[`b8df028`](https://github.com/apache/spark/commit/b8df0284aa7bd4328ff7f8e1ebdce55272e549d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r6729
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathExpressionSuite.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, 
Literal, NonFoldableLiteral}
+import org.apache.spark.sql.types.StringType
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * Test suite for various xpath functions.
+ */
+class XPathExpressionSuite extends SparkFunSuite with ExpressionEvalHelper 
{
+
+  private def testBoolean[T](xml: String, path: String, expected: T): Unit 
= {
+checkEvaluation(
+  XPathBoolean(Literal.create(xml, StringType), Literal.create(path, 
StringType)),
+  expected)
+  }
+
+  test("xpath_boolean") {
+testBoolean("b", "a/b", true)
+testBoolean("b", "a/c", false)
+testBoolean("b", "a/b = \"b\"", true)
+testBoolean("b", "a/b = \"c\"", false)
+testBoolean("10", "a/b < 10", false)
+testBoolean("10", "a/b = 10", true)
+
+// null input
+testBoolean(null, null, null)
+testBoolean(null, "a", null)
+testBoolean("10", null, null)
+
+// exception handling for invalid input
+intercept[Exception] {
+  testBoolean("/a>", "a", null)
+}
+  }
+
+  test("xpath_boolean path cache invalidation") {
+// This is a test to ensure the expression is not reusing the path for 
different strings
+val xml = NonFoldableLiteral("b")
+val path = NonFoldableLiteral("a/b")
+val expr = XPathBoolean(xml, path)
+
+// Run evaluation once
+assert(expr.eval(null) == true)
+
+// Change the input path and make sure we don't screw up caching
+path.value = UTF8String.fromString("a/c")
+assert(expr.eval(null) == false)
--- End diff --

To test changing the input, I think it's clearer to use `BoundReference`:
```
val expr = XPathBoolean(Literal("b"), 'path.string.at(1))
checkEvaluation(expr, true, create_row("a/b"))
checkEvaluation(expr, false, create_row("a/c"))
```





[GitHub] spark issue #13860: [SPARK-16157] [SQL] Add New Methods for comments in Stru...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13860
  
**[Test build #61451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61451/consoleFull)**
 for PR 13860 at commit 
[`b6ded4d`](https://github.com/apache/spark/commit/b6ded4d5381378781a41645950c348095cbf1292).





[GitHub] spark issue #13886: [SPARK-16185] [SQL] Better Error Messages When Creating ...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13886
  
**[Test build #61450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61450/consoleFull)**
 for PR 13886 at commit 
[`e4cc35d`](https://github.com/apache/spark/commit/e4cc35d88e87941d8b9d2e3a2f754a43d29d4c70).





[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r6152
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala ---
@@ -52,6 +52,23 @@ class DataTypeSuite extends SparkFunSuite {
 assert(StructField("b", LongType, false) === struct("b"))
   }
 
+  test("construct with add from StructField with comments") {
+// Test creation from StructField using four different ways
+val struct = (new StructType)
+  .add("a", "int", true, "test1")
+  .add("c", StringType, true, "test3")
--- End diff --

Sure. Will do it. Thanks!





[GitHub] spark pull request #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13966#discussion_r6095
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -162,6 +162,42 @@ case class ConcatWs(children: Seq[Expression])
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(n, str1, str2, ...) - returns the n-th string",
+  extended = "> SELECT _FUNC_(1, 'scala', 'java') FROM src LIMIT 1;\n" + 
"'scala'")
+case class Elt(children: Seq[Expression])
+  extends Expression with ExpectsInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, "elt requires at least one argument.")
--- End diff --

But then we would not be able to reuse `ExpectsInputTypes`?






[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r6077
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala ---
@@ -52,6 +52,23 @@ class DataTypeSuite extends SparkFunSuite {
 assert(StructField("b", LongType, false) === struct("b"))
   }
 
+  test("construct with add from StructField with comments") {
+// Test creation from StructField using four different ways
+val struct = (new StructType)
+  .add("a", "int", true, "test1")
+  .add("c", StringType, true, "test3")
--- End diff --

a very minor comment: can we name these fields `a, b, c, d` instead of `a, 
c, d, e`? The missing `b` is kind of annoying to me...





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r6030
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 ---
@@ -301,6 +302,7 @@ object FunctionRegistry {
 expression[UnBase64]("unbase64"),
 expression[Unhex]("unhex"),
 expression[Upper]("upper"),
+expression[XPathBoolean]("xpath_boolean"),
--- End diff --

hm let's not register the xml ones there yet.






[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r6029
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
 ---
@@ -363,6 +363,31 @@ class DataFrameReaderWriterSuite extends QueryTest 
with SharedSQLContext with Be
 spark.range(10).write.orc(dir)
   }
 
+  test("column nullability and comment - write and then read") {
--- End diff --

This is from the original PR: https://github.com/apache/spark/pull/13764

In that PR, we only added test coverage for the SQL interface; we removed the test cases for the non-SQL interfaces.





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r6009
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathExpressionSuite.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, 
Literal, NonFoldableLiteral}
+import org.apache.spark.sql.types.StringType
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * Test suite for various xpath functions.
+ */
+class XPathExpressionSuite extends SparkFunSuite with ExpressionEvalHelper 
{
+
+  private def testBoolean[T](xml: String, path: String, expected: T): Unit 
= {
+checkEvaluation(
+  XPathBoolean(Literal.create(xml, StringType), Literal.create(path, 
StringType)),
+  expected)
+  }
+
+  test("xpath_boolean") {
+testBoolean("b", "a/b", true)
+testBoolean("b", "a/c", false)
+testBoolean("b", "a/b = \"b\"", true)
+testBoolean("b", "a/b = \"c\"", false)
+testBoolean("10", "a/b < 10", false)
+testBoolean("10", "a/b = 10", true)
+
+// null input
+testBoolean(null, null, null)
+testBoolean(null, "a", null)
+testBoolean("10", null, null)
+
+// exception handling for invalid input
+intercept[Exception] {
+  testBoolean("/a>", "a", null)
+}
+  }
+
+  test("xpath_boolean path cache invalidation") {
--- End diff --

The underlying implementation can still exploit that (e.g. Hive's 
implementation does it), so I'm thinking it might be useful.
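
For reference, a rough, hedged sketch (not Hive's or this PR's exact code) of the kind of caching an implementation could do, recompiling the XPath expression only when the incoming path string changes:
```
import javax.xml.xpath.{XPath, XPathExpression, XPathFactory}

// hypothetical helper: keep the last compiled expression so repeated
// evaluations with a constant path avoid recompiling the XPath
class CachingXPathEvaluator {
  private val xpath: XPath = XPathFactory.newInstance().newXPath()
  private var lastPath: String = null
  private var compiled: XPathExpression = null

  def compile(path: String): XPathExpression = {
    if (path != lastPath) {
      compiled = xpath.compile(path)
      lastPath = path
    }
    compiled
  }
}
```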





[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r68887957
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala ---
@@ -52,6 +52,23 @@ class DataTypeSuite extends SparkFunSuite {
 assert(StructField("b", LongType, false) === struct("b"))
   }
 
+  test("construct with add from StructField with comments") {
--- End diff --

ok let's keep it





[GitHub] spark issue #13886: [SPARK-16185] [SQL] Better Error Messages When Creating ...

2016-06-28 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/13886
  
retest this please





[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r68887844
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala ---
@@ -52,6 +52,23 @@ class DataTypeSuite extends SparkFunSuite {
 assert(StructField("b", LongType, false) === struct("b"))
   }
 
+  test("construct with add from StructField with comments") {
--- End diff --

Since this PR also adds two `add` overloads to `StructType`, this test case also covers the interface changes.

I am fine with that if you think the test is useless. Let me know if you are OK with keeping them. Thanks!






[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
For 3, I respect your opinion. I just made another commit for 2.





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68887731
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 ---
@@ -301,6 +302,7 @@ object FunctionRegistry {
 expression[UnBase64]("unbase64"),
 expression[Unhex]("unhex"),
 expression[Upper]("upper"),
+expression[XPathBoolean]("xpath_boolean"),
--- End diff --

should we also register this function in `org.apache.spark.sql.functions`?





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68887674
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathExpressionSuite.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, 
Literal, NonFoldableLiteral}
+import org.apache.spark.sql.types.StringType
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * Test suite for various xpath functions.
+ */
+class XPathExpressionSuite extends SparkFunSuite with ExpressionEvalHelper 
{
+
+  private def testBoolean[T](xml: String, path: String, expected: T): Unit 
= {
+checkEvaluation(
+  XPathBoolean(Literal.create(xml, StringType), Literal.create(path, 
StringType)),
+  expected)
+  }
+
+  test("xpath_boolean") {
+testBoolean("b", "a/b", true)
+testBoolean("b", "a/c", false)
+testBoolean("b", "a/b = \"b\"", true)
+testBoolean("b", "a/b = \"c\"", false)
+testBoolean("10", "a/b < 10", false)
+testBoolean("10", "a/b = 10", true)
+
+// null input
+testBoolean(null, null, null)
+testBoolean(null, "a", null)
+testBoolean("10", null, null)
+
+// exception handling for invalid input
+intercept[Exception] {
+  testBoolean("/a>", "a", null)
+}
+  }
+
+  test("xpath_boolean path cache invalidation") {
--- End diff --

do we still need to test it? there is no cache anymore





[GitHub] spark issue #13919: [SPARK-16222] [SQL] JDBC Sources - Handling illegal inpu...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13919
  
**[Test build #61448 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61448/consoleFull)**
 for PR 13919 at commit 
[`b999b8a`](https://github.com/apache/spark/commit/b999b8a1474fc9d4f8d3a4c94694fbc40572111a).





[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13906
  
**[Test build #61449 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61449/consoleFull)**
 for PR 13906 at commit 
[`4d937dc`](https://github.com/apache/spark/commit/4d937dc83da661b24d2af1dd513687f4a63b29b0).





[GitHub] spark pull request #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13966#discussion_r68887526
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -162,6 +162,42 @@ case class ConcatWs(children: Seq[Expression])
   }
 }
 
+@ExpressionDescription(
+  usage = "_FUNC_(n, str1, str2, ...) - returns the n-th string",
+  extended = "> SELECT _FUNC_(1, 'scala', 'java') FROM src LIMIT 1;\n" + 
"'scala'")
+case class Elt(children: Seq[Expression])
+  extends Expression with ExpectsInputTypes with CodegenFallback {
+
+  require(children.nonEmpty, "elt requires at least one argument.")
--- End diff --

we should use the expression type-check framework for this; see `Coalesce` as an example
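
For reference, a minimal, hypothetical sketch (not this PR's code) of what a `Coalesce`-style check could look like for `Elt`, replacing the `require(...)` above; `TypeCheckResult` lives in `org.apache.spark.sql.catalyst.analysis`:
```
// hypothetical checkInputDataTypes override: report problems through the
// analyzer's type-check framework instead of throwing from require(...)
override def checkInputDataTypes(): TypeCheckResult = {
  if (children.isEmpty) {
    TypeCheckResult.TypeCheckFailure("elt requires at least one argument")
  } else if (children.head.dataType != IntegerType) {
    TypeCheckResult.TypeCheckFailure("the first argument of elt should be an integer")
  } else {
    TypeCheckResult.TypeCheckSuccess
  }
}
```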





[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13893
  
It's a good point; it looks like we can also improve the `PushDownPredicate` rule based on this.





[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13912
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13912
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61443/
Test PASSed.





[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13912
  
**[Test build #61443 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61443/consoleFull)**
 for PR 13912 at commit 
[`0e1901e`](https://github.com/apache/spark/commit/0e1901e6aeb47edc657ad11a0bea38d5a0f9c7f5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public final class JavaStructuredNetworkWordCount `
  * `class OptionUtils(object):`
  * `class DataFrameReader(OptionUtils):`
  * `class DataFrameWriter(OptionUtils):`
  * `class DataStreamReader(OptionUtils):`
  * `case class ShowFunctionsCommand(`





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68887194
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 ---
@@ -301,6 +302,7 @@ object FunctionRegistry {
 expression[UnBase64]("unbase64"),
 expression[Unhex]("unhex"),
 expression[Upper]("upper"),
+expression[XPathBoolean]("xpath_boolean"),
--- End diff --

done.






[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13912
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13912
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61444/
Test FAILed.





[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r68887161
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
 ---
@@ -363,6 +363,31 @@ class DataFrameReaderWriterSuite extends QueryTest 
with SharedSQLContext with Be
 spark.range(10).write.orc(dir)
   }
 
+  test("column nullability and comment - write and then read") {
--- End diff --

hmmm, why do we add this test?





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68887178
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
+  @transient private var lastPathUtf8: UTF8String = null
--- End diff --

Great idea. Done!






[GitHub] spark issue #13912: [SPARK-16216][SQL] CSV data source supports custom date ...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13912
  
**[Test build #61444 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61444/consoleFull)**
 for PR 13912 at commit 
[`d03e7a0`](https://github.com/apache/spark/commit/d03e7a0806691b2ad3290cbf7e16a771faf55af1).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13966
  
**[Test build #3139 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3139/consoleFull)**
 for PR 13966 at commit 
[`7cea3b1`](https://github.com/apache/spark/commit/7cea3b1b2a1d34c14515242477903db5b4e6fb84).





[GitHub] spark pull request #13860: [SPARK-16157] [SQL] Add New Methods for comments ...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13860#discussion_r68886986
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala ---
@@ -52,6 +52,23 @@ class DataTypeSuite extends SparkFunSuite {
 assert(StructField("b", LongType, false) === struct("b"))
   }
 
+  test("construct with add from StructField with comments") {
--- End diff --

Although more tests are better, I think for this case we only need to test `withComment` and `getComment`; it's obvious that the other `StructField` creation paths call `withComment`.





[GitHub] spark issue #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13964
  
**[Test build #3140 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3140/consoleFull)**
 for PR 13964 at commit 
[`bdd49aa`](https://github.com/apache/spark/commit/bdd49aad79c6109046195f1f2713283a947d61f3).





[GitHub] spark issue #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13966
  
Can one of the admins verify this patch?





[GitHub] spark issue #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13966
  
cc @dongjoon-hyun @cloud-fan @rxin





[GitHub] spark pull request #13966: [SPARK-16276][SQL] Implement elt SQL function

2016-06-28 Thread petermaxlee
GitHub user petermaxlee opened a pull request:

https://github.com/apache/spark/pull/13966

[SPARK-16276][SQL] Implement elt SQL function

## What changes were proposed in this pull request?
This patch implements the elt function, as it is implemented in Hive.

## How was this patch tested?
Added expression unit test in StringExpressionsSuite and end-to-end test in 
StringFunctionsSuite.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/petermaxlee/spark SPARK-16276

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13966.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13966


commit 7cea3b1b2a1d34c14515242477903db5b4e6fb84
Author: petermaxlee 
Date:   2016-06-29T05:19:53Z

[SPARK-16276][SQL] Implement elt SQL function







[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-06-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/13893
  
Predicates should not be reordered if a condition contains non-deterministic parts. For example, 'rand() < 0.1 AND a=1' should not be reordered to 'a=1 AND rand() < 0.1', because the number of calls to rand() would change and the query would output different rows.
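
For illustration, a small hedged example (hypothetical DataFrame `df` with an integer column `a`) showing that the two orderings are not equivalent:
```
// the two filters below can keep different rows: reordering changes how many
// times rand() is invoked and which rows its results are paired with
val kept1 = df.filter("rand() < 0.1 AND a = 1")
val kept2 = df.filter("a = 1 AND rand() < 0.1")  // not a valid rewrite of kept1
```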





[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-06-28 Thread gurvindersingh
Github user gurvindersingh commented on a diff in the pull request:

https://github.com/apache/spark/pull/13950#discussion_r68886506
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -127,7 +128,14 @@ private[deploy] class Master(
 logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
 webUi = new MasterWebUI(this, webUiPort)
 webUi.bind()
-masterWebUiUrl = "http://; + masterPublicAddress + ":" + 
webUi.boundPort
+if (reverseProxy) {
+  masterWebUiUrl = conf.get("spark.ui.reverseProxyUrl", null)
+  if (masterWebUiUrl == null) {
+   throw new SparkException("spark.ui.reverseProxyUrl must be 
provided")
--- End diff --

I've updated the code to remove the exception and use the public address as the default, overriding it when reverseProxyUrl is given. That should solve the issue you were seeing.





[GitHub] spark pull request #13950: [SPARK-15487] [Web UI] Spark Master UI to reverse...

2016-06-28 Thread gurvindersingh
Github user gurvindersingh commented on a diff in the pull request:

https://github.com/apache/spark/pull/13950#discussion_r68886448
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -127,7 +128,14 @@ private[deploy] class Master(
 logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
 webUi = new MasterWebUI(this, webUiPort)
 webUi.bind()
-masterWebUiUrl = "http://; + masterPublicAddress + ":" + 
webUi.boundPort
+if (reverseProxy) {
+  masterWebUiUrl = conf.get("spark.ui.reverseProxyUrl", null)
--- End diff --

It is used when you are running the Spark master itself behind a proxy, e.g. OAuth2, to provide authentication/authorization. It's there to make sure the "Back to Master" link works when you are on a worker's UI.





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68886429
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

Sorry, I read the code wrong. You are testing cache invalidation, so this must be mutable.





[GitHub] spark issue #13948: [SPARK-16259] [PYSPARK] cleanup options in DataFrame rea...

2016-06-28 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13948
  
Feel free to do it, but please take another careful look before you cherry-pick.






[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-28 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/13933
  
@gatorsmile Parquet, JSON, and the other file formats support both the `path` and `paths` options, so that's not a problem.
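
For example (a hedged sketch assuming an active `SparkSession` named `spark` and placeholder file paths), the built-in file sources accept the path either as an option or as load arguments:
```
// both forms read JSON input; the second also shows passing multiple paths
val df1 = spark.read.format("json").option("path", "/tmp/a.json").load()
val df2 = spark.read.json("/tmp/a.json", "/tmp/b.json")
```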





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68886317
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
+  @transient private var lastPathUtf8: UTF8String = null
--- End diff --

Then it's more obvious that we are trying to optimize for the case when the path is a literal.





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68886279
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
+  @transient private var lastPathUtf8: UTF8String = null
--- End diff --

how about
```
@transient lazy val pathLiteral: String = path match {
  case Literal(str: UTF8String, StringType) => str.toString
  case _ => null
}
```
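
For context, a rough sketch of how such a cached literal path could be used during evaluation (hypothetical method body; it assumes it sits inside `XPathBoolean` next to `xpathUtil` and the `pathLiteral` shown above, and that `UDFXPathUtil.evalBoolean(xml, path)` is available as in Hive's helper):
```scala
override protected def nullSafeEval(xmlInput: Any, pathInput: Any): Any = {
  // Convert the path once when it was a foldable literal; otherwise convert per row.
  val pathString =
    if (pathLiteral != null) pathLiteral
    else pathInput.asInstanceOf[UTF8String].toString
  xpathUtil.evalBoolean(xmlInput.asInstanceOf[UTF8String].toString, pathString)
}
```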


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
By the way, regarding complexity: it's a 23-line optimizer rule without blanks/comments. 
In fact, it's shorter than `NullPropagation`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13948: [SPARK-16259] [PYSPARK] cleanup options in DataFrame rea...

2016-06-28 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/13948
  
@rxin why don't we merge this one to 2.0?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13955: [SPARK-16266][SQL][STREAING] Moved DataStreamRead...

2016-06-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13955


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
It sounds promising. Maybe Spark 2.1?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13955: [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writ...

2016-06-28 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/13955
  
LGTM. Merging to master and 2.0. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
Sounds interesting. You mean a `LocalNode` that computes all operators locally on a 
`LocalRelation`, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13893
  
No, the order of the predicates doesn't matter. Our optimizer can reorder the 
predicates to run them more efficiently.
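
As a small illustration (the DataFrame is made up; this assumes an existing `SparkSession` named `spark`), the two filters below are logically equivalent and Catalyst is free to evaluate their conjuncts in either order:
```scala
import spark.implicits._  // assumes an existing SparkSession named `spark`

val df = Seq((1, 5), (2, 20)).toDF("a", "b")
// Logically equivalent; the optimizer may evaluate the cheaper conjunct first.
val q1 = df.filter("b > 10 AND a % 2 = 0")
val q2 = df.filter("a % 2 = 0 AND b > 10")
```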


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13704#discussion_r68885567
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -837,8 +837,36 @@ case class Cast(child: Expression, dataType: DataType) 
extends UnaryExpression w
 val j = ctx.freshName("j")
 val values = ctx.freshName("values")
 
+val isPrimitiveFrom = ctx.isPrimitiveType(fromType)
--- End diff --

We need to make sure the input array's element nullability is false, but a primitive 
element type alone doesn't guarantee that; e.g. we can have 
`ArrayType(ByteType, true)`.
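
A small sketch of the point (hypothetical helper, not the PR's code): having a primitive element type is not enough, the fast path also needs `containsNull == false`:
```scala
import org.apache.spark.sql.types._

// Hypothetical helper: a primitive-element cast fast path is only safe when the
// array is declared with containsNull = false.
def canUsePrimitiveFastPath(elementType: DataType, arrayType: ArrayType): Boolean = {
  val elementIsPrimitive = elementType match {
    case BooleanType | ByteType | ShortType | IntegerType |
         LongType | FloatType | DoubleType => true
    case _ => false
  }
  elementIsPrimitive && !arrayType.containsNull
}

// ArrayType(ByteType, containsNull = true) must NOT take the fast path:
// canUsePrimitiveFastPath(ByteType, ArrayType(ByteType, containsNull = true)) == false
```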


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13906
  
I'm not sure this optimization is useful:

1. An empty `LocalRelation` is a corner case and seems not worth optimizing.
2. The optimization rule in this PR is kind of complex.
3. If we get better handling for `LocalRelation` in the future (like the 
`LocalNode`), this rule will become useless.

cc @marmbrus @yhuai


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-06-28 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/13893
  
@cloud-fan I pushed a commit that applies predicate pushdown only to the deterministic 
predicates placed before any non-deterministic ones. Should this optimization be safe 
to do?
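
A minimal sketch of that split (hypothetical helper, not the commit itself): only the deterministic predicates that come before the first non-deterministic one are pushed down, since pushing later ones would change how many rows the non-deterministic predicate sees:
```scala
import org.apache.spark.sql.catalyst.expressions.Expression

// Hypothetical helper: partition a conjunct list into a pushable deterministic
// prefix and the remainder that must stay above the non-deterministic predicate.
def splitPushablePrefix(predicates: Seq[Expression]): (Seq[Expression], Seq[Expression]) = {
  val pushable = predicates.takeWhile(_.deterministic)
  (pushable, predicates.drop(pushable.length))
}
```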


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884892
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 ---
@@ -301,6 +302,7 @@ object FunctionRegistry {
 expression[UnBase64]("unbase64"),
 expression[Unhex]("unhex"),
 expression[Upper]("upper"),
+expression[XPathBoolean]("xpath_boolean"),
--- End diff --

What about excluding it from `HiveSessionCatalog`, too?


https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala#L230
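
A hedged illustration of the suggestion (the names below are only examples; the real `hiveFunctions` list in `HiveSessionCatalog` is longer): once `xpath_boolean` is registered natively, it should no longer be listed among the functions that fall back to Hive:
```scala
// Example only: functions still delegated to Hive. With a native XPathBoolean in
// FunctionRegistry, "xpath_boolean" should be dropped from (or never added to) this list.
object HiveFallbackFunctions {
  val names: Seq[String] = Seq(
    "histogram_numeric",
    "percentile"
    // "xpath_boolean"  // not needed here anymore: now a native Catalyst expression
  )
}
```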


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13965: [SPARK-16236] [SQL] [FOLLOWUP] Add Path Option back to L...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13965
  
**[Test build #61447 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61447/consoleFull)**
 for PR 13965 at commit 
[`9cfd673`](https://github.com/apache/spark/commit/9cfd67350670ef668781cf498597612713cba628).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13965: [SPARK-16236] [SQL] [FOLLOWUP] Add Path Option ba...

2016-06-28 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/13965

[SPARK-16236] [SQL] [FOLLOWUP] Add Path Option back to Load API in 
DataFrameReader

## What changes were proposed in this pull request?
When users specify one and only one path, we use `options` to record the path value 
in `DataFrameReader`. For example, users can see the `path` option after the 
following API call:
```scala
spark.read.parquet("/test")
```

The Python API has the same issue. Thanks for identifying it, @zsxwing! Below is an 
example:
```python
spark.read.format('json').load('python/test_support/sql/people.json')
```
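
For reference, a minimal Scala sketch of the behavior this follow-up restores (assumes a running `SparkSession` named `spark`; the path is illustrative):
```scala
// With a single path, load() records it as the "path" option again,
// so these two calls are equivalent:
val viaOption = spark.read.format("parquet").option("path", "/test").load()
val viaPath   = spark.read.parquet("/test")
```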
## How was this patch tested?
Existing test cases cover the changes in this PR.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark optionPaths

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13965


commit 9cfd67350670ef668781cf498597612713cba628
Author: gatorsmile 
Date:   2016-06-29T04:38:57Z

fix




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884564
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/XmlFunctionsSuite.scala ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+/**
+ * End-to-end tests for XML expressions.
+ */
+class XmlFunctionsSuite extends QueryTest with SharedSQLContext {
+
+  test("xpath_boolean") {
+val input = "<a><b>b</b></a>"
+val path = "a/b"
--- End diff --

I've updated this. PTAL.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884371
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/XmlFunctionsSuite.scala ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+/**
+ * End-to-end tests for XML expressions.
+ */
+class XmlFunctionsSuite extends QueryTest with SharedSQLContext {
+
+  test("xpath_boolean") {
+val input = "<a><b>b</b></a>"
+val path = "a/b"
--- End diff --

will do.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884363
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

What do you mean?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884303
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/XmlFunctionsSuite.scala ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+/**
+ * End-to-end tests for XML expressions.
+ */
+class XmlFunctionsSuite extends QueryTest with SharedSQLContext {
+
+  test("xpath_boolean") {
+val input = "<a><b>b</b></a>"
+val path = "a/b"
--- End diff --

For an end-to-end test, I think it's better to use an attribute as input, not a 
literal.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884274
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/XmlFunctionsSuite.scala ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+/**
+ * End-to-end tests for XML expressions.
+ */
+class XmlFunctionsSuite extends QueryTest with SharedSQLContext {
+
+  test("xpath_boolean") {
+val input = "<a><b>b</b></a>"
+val path = "a/b"
--- End diff --

how about
```
val df = Seq("b" -> "a/b").toDF("xml", "path")
checkAnswer(df.select(expr("xpath_boolean(xml, path)")), Row(true))
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11863
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61436/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13961: [SPARK-16271][SQL] Implement Hive's UDFXPathUtil

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13961
  
**[Test build #3137 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3137/consoleFull)**
 for PR 13961 at commit 
[`90bf2f1`](https://github.com/apache/spark/commit/90bf2f1ac93c6f83a028edfbc79cf956777f205a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ReusableStringReaderSuite extends SparkFunSuite `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884185
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathExpressionSuite.scala
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, 
Literal, NonFoldableLiteral}
+import org.apache.spark.sql.types.StringType
+import org.apache.spark.unsafe.types.UTF8String
+
+/**
+ * Test suite for various xpath functions.
+ */
+class XPathExpressionSuite extends SparkFunSuite with ExpressionEvalHelper 
{
--- End diff --

I wrote this one based on what's already in the code base for other 
expressions. Let me know if I should do anything else.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11863
  
**[Test build #61436 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61436/consoleFull)**
 for PR 11863 at commit 
[`b1eec57`](https://github.com/apache/spark/commit/b1eec577d64f82784afaf626ad5a325bc7a1d555).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884156
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
--- End diff --

I think a literal path is a very common case, but a literal XML string is fairly 
unlikely.
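
A small usage sketch of that common case (assumes a `SparkSession` named `spark` and the `xpath_boolean` function from this PR; the data is illustrative): the path is a constant for every row while the XML comes from a column, so caching only the converted path pays off:
```scala
import spark.implicits._

val df = Seq("<a><b>1</b></a>", "<a><c>1</c></a>").toDF("xml")
// The literal path 'a/b' is converted once; the XML column varies per row.
df.selectExpr("xpath_boolean(xml, 'a/b')").show()  // true, false
```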


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61441/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884143
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

It's OK if it saves us a lot of effort, but it seems it doesn't?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13778
  
**[Test build #61441 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61441/consoleFull)**
 for PR 13778 at commit 
[`65a33b0`](https://github.com/apache/spark/commit/65a33b05eaeef8454f8746313075163e21f73c8f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884058
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

Oh, I agree.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13963
  
**[Test build #61445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61445/consoleFull)**
 for PR 13963 at commit 
[`a314e56`](https://github.com/apache/spark/commit/a314e56457d8f6949b7d7463882e98127c24b680).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13963
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68884017
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
--- End diff --

Shall we also optimize when the xml string is a literal but the path string 
is not?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13963
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61445/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68883944
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/XPathBoolean.scala
 ---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions.xml
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types.{AbstractDataType, BooleanType, 
DataType, StringType}
+import org.apache.spark.unsafe.types.UTF8String
+
+
+@ExpressionDescription(
+  usage = "_FUNC_(xml, xpath) - Evaluates a boolean xpath expression.",
+  extended = "> SELECT _FUNC_('1','a/b');\ntrue")
+case class XPathBoolean(xml: Expression, path: Expression)
+  extends BinaryExpression with ExpectsInputTypes with CodegenFallback {
+
+  @transient private lazy val xpathUtil = new UDFXPathUtil
+
+  // We use these to avoid converting the path from UTF8String to String 
if it is a constant.
--- End diff --

Shall we also optimize for a literal xml string?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68883895
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

I thought this should be OK since the literal is non-foldable and this 
class is only in the testing package.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13778
  
@cloud-fan @vlad17 Is this change good for you now? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13964#discussion_r68883797
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/NonFoldableLiteral.scala
 ---
@@ -26,7 +26,7 @@ import org.apache.spark.sql.types._
  * A literal value that is not foldable. Used in expression codegen 
testing to test code path
  * that behave differently based on foldable values.
  */
-case class NonFoldableLiteral(value: Any, dataType: DataType) extends 
LeafExpression {
+case class NonFoldableLiteral(var value: Any, dataType: DataType) extends 
LeafExpression {
--- End diff --

Ur, it seems to be for testing purposes. Is it okay?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13930: [SPARK-16228][SQL] HiveSessionCatalog should return `dou...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13930
  
**[Test build #61446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61446/consoleFull)**
 for PR 13930 at commit 
[`b8df028`](https://github.com/apache/spark/commit/b8df0284aa7bd4328ff7f8e1ebdce55272e549d2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13930: [SPARK-16228][SQL] HiveSessionCatalog should return `dou...

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13930
  
Rebased onto master for https://github.com/apache/spark/pull/13939.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61435/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13778
  
**[Test build #61435 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61435/consoleFull)**
 for PR 13778 at commit 
[`1583fe3`](https://github.com/apache/spark/commit/1583fe3380ad3eef8f75d7709b9769e7d4e11477).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public final class JavaStructuredNetworkWordCount `
  * `final class Binarizer @Since(\"1.4.0\") (@Since(\"1.4.0\") override 
val uid: String)`
  * `final class Bucketizer @Since(\"1.4.0\") (@Since(\"1.4.0\") override 
val uid: String)`
  * `final class ChiSqSelector @Since(\"1.6.0\") (@Since(\"1.6.0\") 
override val uid: String)`
  * `class CountVectorizer @Since(\"1.5.0\") (@Since(\"1.5.0\") override 
val uid: String)`
  * `class CountVectorizerModel(`
  * `class DCT @Since(\"1.5.0\") (@Since(\"1.5.0\") override val uid: 
String)`
  * `class ElementwiseProduct @Since(\"1.4.0\") (@Since(\"1.4.0\") override 
val uid: String)`
  * `class HashingTF @Since(\"1.4.0\") (@Since(\"1.4.0\") override val uid: 
String)`
  * `final class IDF @Since(\"1.4.0\") (@Since(\"1.4.0\") override val uid: 
String)`
  * `class Interaction @Since(\"1.6.0\") (@Since(\"1.6.0\") override val 
uid: String) extends Transformer`
  * `class MaxAbsScaler @Since(\"2.0.0\") (@Since(\"2.0.0\") override val 
uid: String)`
  * `class MinMaxScaler @Since(\"1.5.0\") (@Since(\"1.5.0\") override val 
uid: String)`
  * `class NGram @Since(\"1.5.0\") (@Since(\"1.5.0\") override val uid: 
String)`
  * `class Normalizer @Since(\"1.4.0\") (@Since(\"1.4.0\") override val 
uid: String)`
  * `class OneHotEncoder @Since(\"1.4.0\") (@Since(\"1.4.0\") override val 
uid: String) extends Transformer`
  * `class PCA @Since(\"1.5.0\") (`
  * `class PolynomialExpansion @Since(\"1.4.0\") (@Since(\"1.4.0\") 
override val uid: String)`
  * `final class QuantileDiscretizer @Since(\"1.6.0\") (@Since(\"1.6.0\") 
override val uid: String)`
  * `class RFormula @Since(\"1.5.0\") (@Since(\"1.5.0\") override val uid: 
String)`
  * `class SQLTransformer @Since(\"1.6.0\") (@Since(\"1.6.0\") override val 
uid: String) extends Transformer`
  * `class StandardScaler @Since(\"1.4.0\") (`
  * `class StopWordsRemover @Since(\"1.5.0\") (@Since(\"1.5.0\") override 
val uid: String)`
  * `class StringIndexer @Since(\"1.4.0\") (`
  * `class Tokenizer @Since(\"1.4.0\") (@Since(\"1.4.0\") override val uid: 
String)`
  * `class RegexTokenizer @Since(\"1.4.0\") (@Since(\"1.4.0\") override val 
uid: String)`
  * `class VectorAssembler @Since(\"1.4.0\") (@Since(\"1.4.0\") override 
val uid: String)`
  * `class VectorIndexer @Since(\"1.4.0\") (`
  * `final class VectorSlicer @Since(\"1.5.0\") (@Since(\"1.5.0\") override 
val uid: String)`
  * `final class Word2Vec @Since(\"1.4.0\") (`
  * `public class JavaPackage `
  * `class OptionUtils(object):`
  * `class DataFrameReader(OptionUtils):`
  * `class DataFrameWriter(OptionUtils):`
  * `class DataStreamReader(OptionUtils):`
  * `case class ShowFunctionsCommand(`
  * `case class StreamingRelationExec(sourceName: String, output: 
Seq[Attribute]) extends LeafExecNode `
  * `class TextSocketSource(host: String, port: Int, sqlContext: 
SQLContext)`
  * `class TextSocketSourceProvider extends StreamSourceProvider with 
DataSourceRegister with Logging `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13964
  
**[Test build #3138 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3138/consoleFull)**
 for PR 13964 at commit 
[`34cda07`](https://github.com/apache/spark/commit/34cda070f62677b6920174cc976107456172aeab).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13964
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13964
  
cc @srowen @cloud-fan @vanzin @squito 

If this one works, I can implement the other ones too.






[GitHub] spark issue #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13964
  
cc @srowen @cloud-fan @vanzin @squito 





[GitHub] spark pull request #13964: [SPARK-16274][SQL] Implement xpath_boolean

2016-06-28 Thread petermaxlee
GitHub user petermaxlee opened a pull request:

https://github.com/apache/spark/pull/13964

[SPARK-16274][SQL] Implement xpath_boolean

## What changes were proposed in this pull request?
This patch implements the xpath_boolean expression for Spark SQL, an xpath function that returns true or false. The implementation is modelled after Hive's xpath_boolean, except for how the expression handles null inputs: Hive throws a NullPointerException at runtime if either input is null, whereas this implementation returns null.
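
For reference, a minimal sketch of the intended behavior, assuming the expression ends up registered under the SQL name `xpath_boolean` (as the title suggests) and using a plain local SparkSession; the inputs are illustrative and not taken from the patch:

```scala
import org.apache.spark.sql.SparkSession

// Hedged sketch only: demonstrates the described semantics, not the patch itself.
object XPathBooleanSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("xpath-boolean-sketch")
      .getOrCreate()

    // true: the path 'a/b' matches a node in the XML document
    spark.sql("SELECT xpath_boolean('<a><b>1</b></a>', 'a/b')").show()

    // false: there is no 'a/c' node
    spark.sql("SELECT xpath_boolean('<a><b>1</b></a>', 'a/c')").show()

    // null: a null argument yields null here, where Hive would throw a NullPointerException
    spark.sql("SELECT xpath_boolean(CAST(NULL AS STRING), 'a/b')").show()

    spark.stop()
  }
}
```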

## How was this patch tested?
Added unit tests for expressions (based on Hive's tests and some I added 
myself) and an end-to-end test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/petermaxlee/spark SPARK-16274

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13964.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13964


commit 34cda070f62677b6920174cc976107456172aeab
Author: petermaxlee 
Date:   2016-06-29T04:09:57Z

[SPARK-16274][SQL] Implement xpath_boolean







[GitHub] spark pull request #13961: [SPARK-16271][SQL] Implement Hive's UDFXPathUtil

2016-06-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13961





[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
Thank you, @cloud-fan .
It seems to be a good idea to handle operators on LocalRelations. But, if possible, may I dig into that in another PR? :)





[GitHub] spark issue #13961: [SPARK-16271][SQL] Implement Hive's UDFXPathUtil

2016-06-28 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13961
  
Merging in master.






[GitHub] spark pull request #13494: [SPARK-15752] [SQL] support optimization for meta...

2016-06-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13494#discussion_r68882374
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/MetadataOnlyOptimizer.scala
 ---
@@ -0,0 +1,171 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, SparkSession}
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.catalog.{CatalogRelation, SessionCatalog}
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.aggregate._
+import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}
+
+/**
+ * When scanning only partition columns, get results based on metadata without scanning files.
+ * It is used for distinct, distinct aggregations, or distinct-like aggregations (e.g. Max/Min).
+ * First of all, scanning only partition columns is required; the rule then handles the
+ * following cases:
+ * 1. the aggregate expressions are partition columns,
+ *  e.g. SELECT col FROM tbl GROUP BY col or SELECT col FROM tbl GROUP BY cube(col).
+ * 2. aggregate functions on partition columns with DISTINCT,
+ *  e.g. SELECT count(DISTINCT col) FROM tbl GROUP BY col.
+ * 3. aggregate functions on partition columns that give the same result as with the
+ *  DISTINCT keyword,
+ *  e.g. SELECT Max(col2) FROM tbl GROUP BY col1.
+ */
+case class MetadataOnlyOptimizer(
+sparkSession: SparkSession,
+catalog: SessionCatalog) extends Rule[LogicalPlan] {
+
+  private def canSupportMetadataOnly(a: Aggregate): Boolean = {
+val aggregateExpressions = a.aggregateExpressions.flatMap { expr =>
+  expr.collect {
+case agg: AggregateExpression => agg
+  }
+}.distinct
+if (aggregateExpressions.isEmpty) {
+  // Support aggregates that have no aggregate function when the expressions are
+  // partition columns, e.g. SELECT partitionCol FROM table GROUP BY partitionCol.
+  // Moreover, multiple-distinct has been rewritten into this form by RewriteDistinctAggregates.
+  true
+} else {
+  aggregateExpressions.forall { agg =>
+if (agg.isDistinct) {
+  true
+} else {
+  // If the function can be evaluated on just the distinct values of a column,
+  // it can be used by the metadata-only optimizer.
+  agg.aggregateFunction match {
+case max: Max => true
+case min: Min => true
+case hyperLog: HyperLogLogPlusPlus => true
+case _ => false
+  }
+}
+  }
+}
+  }
+
+  private def convertLogicalToMetadataOnly(
+  project: LogicalPlan,
+  filter: Option[Expression],
+  logical: LogicalRelation,
+  files: HadoopFsRelation): LogicalPlan = {
+val attributeMap = logical.output.map(attr => (attr.name, attr)).toMap
+val partitionColumns = files.partitionSchema.map { field =>
+  attributeMap.getOrElse(field.name, throw new AnalysisException(
+s"Unable to resolve ${field.name} given 
[${logical.output.map(_.name).mkString(", ")}]"))
+}
+val projectSet = filter.map(project.references ++ _.references).getOrElse(project.references)
+if (projectSet.subsetOf(AttributeSet(partitionColumns))) {
+  val selectedPartitions = files.location.listFiles(filter.map(Seq(_)).getOrElse(Seq.empty))
+  val valuesRdd = sparkSession.sparkContext.parallelize(selectedPartitions.map(_.values), 1)
+  val valuesPlan = LogicalRDD(partitionColumns, valuesRdd)(sparkSession)
+  valuesPlan
+} else {
+  logical
+}
+  }
+
+  private def convertCatalogToMetadataOnly(
+  project: 

[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13963
  
**[Test build #61445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61445/consoleFull)** for PR 13963 at commit [`a314e56`](https://github.com/apache/spark/commit/a314e56457d8f6949b7d7463882e98127c24b680).





[GitHub] spark pull request #13963: [TRIVIAL][PYSPARK] Clean up orc compression optio...

2016-06-28 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/13963

[TRIVIAL][PYSPARK] Clean up orc compression option as well

## What changes were proposed in this pull request?

This PR corrects the ORC compression option for PySpark as well. I think this was missed in https://github.com/apache/spark/pull/13948.

## How was this patch tested?

N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark minor-orc-compress

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13963.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13963


commit a314e56457d8f6949b7d7463882e98127c24b680
Author: hyukjinkwon 
Date:   2016-06-29T03:54:30Z

Clean up orc compression option as well







[GitHub] spark issue #13963: [TRIVIAL][PYSPARK] Clean up orc compression option as we...

2016-06-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13963
  
cc @davies 




