[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14034
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14034
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62073/
Test FAILed.





[GitHub] spark issue #14131: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14131
  
**[Test build #62076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62076/consoleFull)** for PR 14131 at commit [`4d6f654`](https://github.com/apache/spark/commit/4d6f6544be4373a32150fd6d59ba539d3fcb6aab).





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14034
  
**[Test build #62073 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62073/consoleFull)** for PR 14034 at commit [`2e6f8d8`](https://github.com/apache/spark/commit/2e6f8d8c8b5007302415b7fd984a38fc51be44bf).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14114#discussion_r70204413
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -246,9 +246,27 @@ class SessionCatalog(
   def getTableMetadata(name: TableIdentifier): CatalogTable = {
--- End diff --

Yep.





[GitHub] spark issue #14131: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/14131
  
cc @cloud-fan





[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14114#discussion_r70204393
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -425,10 +443,11 @@ class SessionCatalog(
   def tableExists(name: TableIdentifier): Boolean = synchronized {
--- End diff --

Sure.





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
Here it is for branch-2.0.
https://github.com/apache/spark/pull/14131





[GitHub] spark pull request #14131: [SPARK-16318][SQL] Implement all remaining xpath ...

2016-07-10 Thread petermaxlee
GitHub user petermaxlee opened a pull request:

https://github.com/apache/spark/pull/14131

[SPARK-16318][SQL] Implement all remaining xpath functions (branch-2.0)

## What changes were proposed in this pull request?
This patch implements all remaining xpath functions that Hive supports but Spark does not natively support: xpath_int, xpath_short, xpath_long, xpath_float, xpath_double, xpath_string, and xpath.

This is based on https://github.com/apache/spark/pull/13991 but for branch-2.0.

## How was this patch tested?
Added unit tests and end-to-end tests.
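The Hive-compatible xpath_* functions described above are thin wrappers over standard XPath evaluation. A minimal, hypothetical sketch of the same semantics using the JDK's built-in `javax.xml.xpath` API (the helper names `xpathString` and `xpathDouble` are illustrative, not Spark's implementation):

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathSketch {
    // xpath_string-like: evaluate an XPath expression against an XML string,
    // returning the string value of the first matching node.
    static String xpathString(String xml, String path) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        return xp.evaluate(path, new InputSource(new StringReader(xml)));
    }

    // xpath_double-like: evaluate to a number (XPath NUMBER result type).
    static double xpathDouble(String xml, String path) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        return (Double) xp.compile(path).evaluate(
            new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(xpathString("<a><b>b1</b><b>b2</b></a>", "a/b[2]")); // b2
        System.out.println(xpathDouble("<a><b>1</b><b>2</b></a>", "sum(a/b)")); // 3.0
    }
}
```

Note that a fresh `InputSource` is built per call because a stream source can only be consumed once.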


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/petermaxlee/spark xpath-branch-2.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14131


commit 4d6f6544be4373a32150fd6d59ba539d3fcb6aab
Author: petermaxlee 
Date:   2016-07-11T05:28:34Z

[SPARK-16318][SQL] Implement all remaining xpath functions

This patch implements all remaining xpath functions that Hive supports but Spark does not natively support: xpath_int, xpath_short, xpath_long, xpath_float, xpath_double, xpath_string, and xpath.

Added unit tests and end-to-end tests.

Author: petermaxlee 

Closes #13991 from petermaxlee/SPARK-16318.







[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-10 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/14112
  
@hhbyyh I think offline testing should be OK for now, since we don't have a unified save/load compatibility test framework yet. It would be good to get this feature into the next RC.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70204050
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives
+ * and string) as input types, and the output is automatically converted to a string.
+ *
+ * @param children the first element should be a literal string for the class name,
+ *                 the second element should be a literal string for the method name,
+ *                 and the remaining elements are input arguments to the Java method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

`CallMethodUsingReflect`?





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
@cloud-fan thanks for merging!

@yhuai I think the degree to which we want to add more tests also depends on how much we trust the library we are using. XPath (with Query) is almost as complicated as SQL itself.






[GitHub] spark pull request #13991: [SPARK-16318][SQL] Implement all remaining xpath ...

2016-07-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13991





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13991
  
Thanks, merging to master! This doesn't merge cleanly to 2.0; @petermaxlee, can you submit a new PR against 2.0? Thanks!





[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13969
  
You can also remove half of the test cases.






[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13969
  
One thing ... it might be better to remove the ability to call non-static methods. At least to me it'd make things slightly simpler and clearer.





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/13991
  
OK. Thanks. Then it would be good to add more tests for cases that are not covered by those Hive tests.





[GitHub] spark issue #11317: [SPARK-12639] [SQL] Mark Filters Fully Handled By Source...

2016-07-10 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/11317
  
test this please





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
BTW, I have to say Hive's test coverage in this area is very spotty, so I don't actually think it's great to follow, but I used those tests.






[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
Actually I created the unit tests based on those.






[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...

2016-07-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14114#discussion_r70203557
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -246,9 +246,27 @@ class SessionCatalog(
   def getTableMetadata(name: TableIdentifier): CatalogTable = {
--- End diff --

same thing here - update SessionCatalogSuite.






[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/13991
  
As a follow-up task, can you take a look at the following query files and add useful tests to your test suite? Thanks.
```
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/describe_xpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath2.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath3.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/input_testxpath4.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_boolean.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_double.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_float.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_int.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_long.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_short.q
.//sql/hive/src/test/resources/ql/src/test/queries/clientpositive/udf_xpath_string.q
```





[GitHub] spark pull request #14114: [SPARK-16458][SQL] SessionCatalog should support ...

2016-07-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14114#discussion_r70203553
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -425,10 +443,11 @@ class SessionCatalog(
   def tableExists(name: TableIdentifier): Boolean = synchronized {
--- End diff --

can you update SessionCatalogSuite to reflect this behavior? I think we weren't checking temp tables in the past.
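The review point above is about lookup order: `tableExists` should consult session-local temporary tables before the persistent catalog. A hypothetical sketch of that precedence (all names and data structures here are illustrative, not Spark's actual `SessionCatalog`):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CatalogSketch {
    // Session-scoped temp tables, keyed by lower-cased unqualified name (hypothetical).
    static final Map<String, String> tempTables = new HashMap<>();
    // Persistent tables, keyed as "db.table" (hypothetical).
    static final Set<String> persistentTables = new HashSet<>();

    static boolean tableExists(String db, String table) {
        String name = table.toLowerCase();
        // An unqualified name resolves to a temp table first, if one exists.
        if (db == null && tempTables.containsKey(name)) {
            return true;
        }
        // Otherwise fall back to the persistent catalog, defaulting the database.
        String database = (db == null) ? "default" : db.toLowerCase();
        return persistentTables.contains(database + "." + name);
    }

    public static void main(String[] args) {
        tempTables.put("t1", "temp-view-definition");
        persistentTables.add("default.t2");
        System.out.println(tableExists(null, "t1")); // true  (temp table)
        System.out.println(tableExists(null, "t2")); // true  (persistent, default db)
        System.out.println(tableExists(null, "t3")); // false
    }
}
```

A qualified name (`db` non-null) skips the temp-table check entirely, which is the behavior the suite would need to assert.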






[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70203510
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives
+ * and string) as input types, and the output is automatically converted to a string.
+ *
+ * @param children the first element should be a literal string for the class name,
+ *                 the second element should be a literal string for the method name,
+ *                 and the remaining elements are input arguments to the Java method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.size < 2) {
+      TypeCheckFailure("requires at least two arguments")
+    } else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) {
+      // The first two arguments must be string type.
+      TypeCheckFailure("first two arguments should be string literals")
+    } else if (!classExists) {
+      TypeCheckFailure(s"class $className not found")
+    } else if (method == null) {
+      TypeCheckFailure(s"cannot find a method that matches the argument types in $className")
+    } else {
+      TypeCheckSuccess
+    }
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+    var i = 0
+    while (i < argExprs.length) {
+      buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+      // Convert if necessary. Based on the types defined in typeMapping, string is the only
+      // type that needs conversion. If we support timestamps, dates, decimals, arrays, or maps
+      // in the future, proper conversion needs to happen here too.
+      if (buffer(i).isInstanceOf[UTF8String]) {
+        buffer(i) = buffer(i).toString
+      }
+      i += 1
+    }
+    UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children has at least two exprs. */
+  @transient private lazy val className = children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+    val methodName = children(1).eval(null).asInstanceOf[UTF8String].toString
+    Reflect.findMethod(className, methodName, argExprs.map(_.dataType)).orNull

[GitHub] spark issue #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13969
  
What's Hive's behaviour when calling a non-static method on a class that doesn't have a no-arg constructor? Null or an exception?





[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12414
  
**[Test build #62074 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62074/consoleFull)** for PR 12414 at commit [`167beae`](https://github.com/apache/spark/commit/167beae592d084ead74d2361dcc3a19d0d53d60b).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14120: [SPARK-16199][SQL] Add a method to list the referenced c...

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14120
  
This can be updated once #14130 is merged.






[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70203430
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+    val df = Seq((1, "one")).toDF("a", "b")
+    checkAnswer(
+      df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)"),
--- End diff --

oh I see, `reflect('org.apache.spark.sql.ReflectClass', 'method1', a, b)` 
is not equivalent to calling `ReflectClass.method1` on the companion object; 
it calls the static method defined on the `ReflectClass` class.
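
Put differently, at evaluation time the expression resolves the class by 
name, finds a matching static method, invokes it with a `null` receiver, and 
stringifies the result. A simplified sketch of that path (not the actual 
Spark code; the real expression also type-checks arguments and caches the 
resolved `Method`):

```scala
object ReflectSketch {
  // Resolve `className`, find a public method matching `methodName` by name
  // and arity, invoke it statically (null receiver), and stringify the result.
  // Matching on arity alone is a simplification; overloads could collide.
  def callStatic(className: String, methodName: String, args: AnyRef*): String = {
    val cls = Class.forName(className)
    val method = cls.getMethods
      .find(m => m.getName == methodName && m.getParameterTypes.length == args.length)
      .getOrElse(throw new NoSuchMethodException(s"$className.$methodName"))
    String.valueOf(method.invoke(null, args: _*))
  }
}
```

For example, `ReflectSketch.callStatic("java.lang.Integer", "toHexString", Int.box(255))` 
returns `"ff"`, mirroring `SELECT reflect('java.lang.Integer', 'toHexString', 255)`.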





[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12414
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62074/
Test FAILed.





[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12414
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14130
  
**[Test build #62075 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62075/consoleFull)**
 for PR 14130 at commit 
[`2915bf1`](https://github.com/apache/spark/commit/2915bf1e79a28a1df0fbc895068fcf8bee2095b0).





[GitHub] spark pull request #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSH...

2016-07-10 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/14130

[SPARK-16477] Bump master version to 2.1.0-SNAPSHOT

## What changes were proposed in this pull request?
After SPARK-16476, we can finally bump the version number.

## How was this patch tested?
N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-16477

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14130


commit 2915bf1e79a28a1df0fbc895068fcf8bee2095b0
Author: Reynold Xin 
Date:   2016-07-11T05:10:43Z

[SPARK-16477] Bump master version to 2.1.0-SNAPSHOT







[GitHub] spark issue #14130: [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14130
  
cc @srowen 





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70202991
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+val df = Seq((1, "one")).toDF("a", "b")
+checkAnswer(
+  df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 
'method1', a, b)"),
--- End diff --

You should decompile the right class (don't decompile the one with a dollar 
sign). Static methods are generated too.
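
A quick way to check this: for a top-level object with no companion class, 
scalac emits the singleton class `Name$` plus a plain class `Name` carrying 
static forwarders, which is what `java.lang.reflect` sees. A minimal sketch 
(the `ReflectDemo` / `ForwarderCheck` names are hypothetical stand-ins, not 
the PR's test class):

```scala
import java.lang.reflect.Modifier

// Hypothetical stand-in for the PR's test class: a top-level object
// whose companion class gets compiler-generated static forwarders.
object ReflectDemo {
  def method1(a: Int, b: String): String = s"$a-$b"
}

object ForwarderCheck {
  // Derive the plain class name ("ReflectDemo") from the singleton
  // class name ("ReflectDemo$"), then look the method up via Java reflection.
  private val plainName = ReflectDemo.getClass.getName.stripSuffix("$")
  private val m = Class.forName(plainName)
    .getMethod("method1", classOf[Int], classOf[String])

  // The forwarder on the plain class is a static method.
  def isStaticForwarder: Boolean = Modifier.isStatic(m.getModifiers)

  // Invokable with a null receiver, exactly how `reflect` would call it.
  def invokeViaReflection: String =
    m.invoke(null, Int.box(1), "one").asInstanceOf[String]
}
```

So even though a companion-object method is compiled as an instance method on 
the singleton class, reflection against the plain class name still finds a 
static entry point.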






[GitHub] spark issue #12414: [SPARK-14657] [ML] RFormula w/o intercept should output ...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12414
  
**[Test build #62074 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62074/consoleFull)**
 for PR 12414 at commit 
[`167beae`](https://github.com/apache/spark/commit/167beae592d084ead74d2361dcc3a19d0d53d60b).





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70202906
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+val df = Seq((1, "one")).toDF("a", "b")
+checkAnswer(
+  df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 
'method1', a, b)"),
--- End diff --

no, a method defined in a companion object is not a static method, but a 
normal method defined on a singleton class. You can decompile the class file 
to check.





[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13704
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62071/
Test PASSed.





[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13704
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13704
  
**[Test build #62071 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62071/consoleFull)**
 for PR 13704 at commit 
[`66800fa`](https://github.com/apache/spark/commit/66800faaebf72e492ee7693d81f8dba980f1dab2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-10 Thread NarineK
Github user NarineK commented on a diff in the pull request:

https://github.com/apache/spark/pull/14090#discussion_r70202736
  
--- Diff: docs/sparkr.md ---
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
 
+ Run a given function on a large dataset grouping by input column(s) 
and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is to 
be applied to each group of the `SparkDataFrame` and should have only two 
parameters: grouping key and R `data.frame` corresponding to
+that key. The groups are chosen from `SparkDataFrame`s column(s).
+The output of function should be a `data.frame`. Schema specifies the row 
format of the resulting
+`SparkDataFrame`. It must match the R function's output.
--- End diff --

Thanks, I was looking at the types.R file and noticed that we have NAs 
for array, map and struct.
https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L42
But I guess in our case we can map array, map and struct to array, 
map and struct respectively?





[GitHub] spark pull request #14128: [SPARK-16476] Restructure MimaExcludes for easier...

2016-07-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14128





[GitHub] spark issue #14128: [SPARK-16476] Restructure MimaExcludes for easier union ...

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14128
  
I'm going to merge this since this is a simple formatting change. I will 
submit a patch that updates the pom files and show what this can do.






[GitHub] spark issue #14128: [SPARK-16476] Restructure MimaExcludes for easier union ...

2016-07-10 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14128
  
Merging in master/2.0.






[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70202613
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && 
e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument 
types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
+  buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+  // Convert if necessary. Based on the types defined in typeMapping, 
string is the only
+  // type that needs conversion. If we support timestamps, dates, 
decimals, arrays, or maps
+  // in the future, proper conversion needs to happen here too.
+  if (buffer(i).isInstanceOf[UTF8String]) {
+buffer(i) = buffer(i).toString
+  }
+  i += 1
+}
+UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = 
children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children 
has at least two exprs. */
+  @transient private lazy val className = 
children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+val methodName = 
children(1).eval(null).asInstanceOf[UTF8String].toString
+Reflect.findMethod(className, methodName, 

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13991
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62070/
Test PASSed.





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13991
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-10 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/14090#discussion_r70202560
  
--- Diff: docs/sparkr.md ---
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
 
+ Run a given function on a large dataset grouping by input column(s) 
and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is to 
be applied to each group of the `SparkDataFrame` and should have only two 
parameters: grouping key and R `data.frame` corresponding to
+that key. The groups are chosen from `SparkDataFrame`s column(s).
+The output of function should be a `data.frame`. Schema specifies the row 
format of the resulting
+`SparkDataFrame`. It must match the R function's output.
--- End diff --

This looks good to me ! 





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13991
  
**[Test build #62070 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62070/consoleFull)**
 for PR 13991 at commit 
[`0c60d87`](https://github.com/apache/spark/commit/0c60d87c0dd1b7e78fd77c2f01b67a2ae8a0151e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ParseUrl(children: Seq[Expression])`





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-10 Thread NarineK
Github user NarineK commented on a diff in the pull request:

https://github.com/apache/spark/pull/14090#discussion_r70202321
  
--- Diff: docs/sparkr.md ---
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
 
+ Run a given function on a large dataset grouping by input column(s) 
and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is to 
be applied to each group of the `SparkDataFrame` and should have only two 
parameters: grouping key and R `data.frame` corresponding to
+that key. The groups are chosen from `SparkDataFrame`s column(s).
+The output of function should be a `data.frame`. Schema specifies the row 
format of the resulting
+`SparkDataFrame`. It must match the R function's output.
--- End diff --

Thanks @shivaram.
Does the following mapping look fine to have in the table?
```
**R           Spark**
byte          byte
integer       integer
float         float
double        double
numeric       double
character     string
string        string
binary        binary
raw           binary
logical       boolean
timestamp     timestamp
date          date
array         array
map           map
struct        struct
```





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14034
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62069/
Test PASSed.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14034
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14034
  
**[Test build #62069 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62069/consoleFull)**
 for PR 14034 at commit 
[`dec5ad9`](https://github.com/apache/spark/commit/dec5ad95bdd003fe58e92d1245388fa4758d8f49).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-10 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/14090#discussion_r70202064
  
--- Diff: docs/sparkr.md ---
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
 
+ Run a given function on a large dataset grouping by input column(s) 
and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is to 
be applied to each group of the `SparkDataFrame` and should have only two 
parameters: grouping key and R `data.frame` corresponding to
+that key. The groups are chosen from `SparkDataFrame`s column(s).
+The output of function should be a `data.frame`. Schema specifies the row 
format of the resulting
+`SparkDataFrame`. It must match the R function's output.
--- End diff --

Yeah but instead of a pointer to the code it would be great if we could 
have a table in the documentation. 





[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13704#discussion_r70202042
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2018,6 +2018,8 @@ class Analyzer(
 fail(child, DateType, walkedTypePath)
   case (StringType, to: NumericType) =>
 fail(child, to, walkedTypePath)
+  case (from: ArrayType, to: ArrayType) if !from.containsNull =>
--- End diff --

I mean MapType. It's similar to ArrayType: its values can be nullable 
or non-nullable.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14034
  
Thank you!





[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
I guess whatever generates that message is buggy?






[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14034
  
LGTM, pending jenkins





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201874
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+val df = Seq((1, "one")).toDF("a", "b")
+checkAnswer(
+  df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 
'method1', a, b)"),
--- End diff --

I don't get what you mean. Scala does have static methods -- methods that
are defined in a companion object are static.
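
As background for this exchange, a minimal Java sketch of the kind of lookup the `reflect` expression performs (`ReflectTarget` and `callStatic` are hypothetical names for this example, not part of Spark): a true static method needs no receiver, so `invoke()` is passed a `null` instance. For a Scala companion object, scalac emits static forwarders in the class file, which is why such methods are reachable the same way.

```java
import java.lang.reflect.Method;

// Hypothetical target class standing in for a user-supplied class name.
class ReflectTarget {
    public static String greet(String name) {
        return "hello, " + name;
    }
}

class ReflectDemo {
    // Look up a public static method by class and method name, then
    // invoke it with a null receiver (valid only for static methods).
    static String callStatic(String className, String methodName, String arg) {
        try {
            Class<?> clazz = Class.forName(className);
            Method method = clazz.getMethod(methodName, String.class);
            return String.valueOf(method.invoke(null, arg));
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```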






[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201875
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && 
e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument 
types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
+  buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+  // Convert if necessary. Based on the types defined in typeMapping, 
string is the only
+  // type that needs conversion. If we support timestamps, dates, 
decimals, arrays, or maps
+  // in the future, proper conversion needs to happen here too.
+  if (buffer(i).isInstanceOf[UTF8String]) {
+buffer(i) = buffer(i).toString
+  }
+  i += 1
+}
+UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = 
children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children 
has at least two exprs. */
+  @transient private lazy val className = 
children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+val methodName = 
children(1).eval(null).asInstanceOf[UTF8String].toString
+Reflect.findMethod(className, methodName, 
argExprs.map(_.dataType)).orNull

[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...

2016-07-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14123
  
@cloud-fan Yeah! Will be in [WIP] until 
https://github.com/apache/spark/pull/14071 is merged.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201841
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

So what's a good name? I am not attached to Reflect, but I think Reflect 
should be in the name, if the function is called reflect.






[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14123
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201786
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && 
e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument 
types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
+  buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+  // Convert if necessary. Based on the types defined in typeMapping, 
string is the only
+  // type that needs conversion. If we support timestamps, dates, 
decimals, arrays, or maps
+  // in the future, proper conversion needs to happen here too.
+  if (buffer(i).isInstanceOf[UTF8String]) {
+buffer(i) = buffer(i).toString
+  }
+  i += 1
+}
+UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = 
children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children 
has at least two exprs. */
+  @transient private lazy val className = 
children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+val methodName = 
children(1).eval(null).asInstanceOf[UTF8String].toString
+Reflect.findMethod(className, methodName, 

[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13991
  
The latest result is `This patch does not merge cleanly.` I just want to
double-check it.





[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14123
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62068/
Test PASSed.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201691
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/MiscFunctionsSuite.scala ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class MiscFunctionsSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  test("reflect and java_method") {
+val df = Seq((1, "one")).toDF("a", "b")
+checkAnswer(
+  df.selectExpr("reflect('org.apache.spark.sql.ReflectClass', 
'method1', a, b)"),
--- End diff --

We should also test it in `JavaDataFrameSuite`; there is no real static
method in Scala.





[GitHub] spark issue #14123: [SPARK-16471] [SQL] Remove Hive-specific CreateHiveTable...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14123
  
**[Test build #62068 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62068/consoleFull)**
 for PR 14123 at commit 
[`082040f`](https://github.com/apache/spark/commit/082040f64130795593d551647f4d451a0b6a9a7e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201642
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && 
e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument 
types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
+  buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+  // Convert if necessary. Based on the types defined in typeMapping, 
string is the only
+  // type that needs conversion. If we support timestamps, dates, 
decimals, arrays, or maps
+  // in the future, proper conversion needs to happen here too.
+  if (buffer(i).isInstanceOf[UTF8String]) {
+buffer(i) = buffer(i).toString
+  }
+  i += 1
+}
+UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = 
children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children 
has at least two exprs. */
+  @transient private lazy val className = 
children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+val methodName = 
children(1).eval(null).asInstanceOf[UTF8String].toString
+Reflect.findMethod(className, methodName, 
argExprs.map(_.dataType)).orNull

[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201559
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && 
e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument 
types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
+  buffer(i) = argExprs(i).eval(input).asInstanceOf[Object]
+  // Convert if necessary. Based on the types defined in typeMapping, 
string is the only
+  // type that needs conversion. If we support timestamps, dates, 
decimals, arrays, or maps
+  // in the future, proper conversion needs to happen here too.
+  if (buffer(i).isInstanceOf[UTF8String]) {
+buffer(i) = buffer(i).toString
+  }
+  i += 1
+}
+UTF8String.fromString(String.valueOf(method.invoke(obj, buffer : _*)))
+  }
+
+  @transient private lazy val argExprs: Array[Expression] = 
children.drop(2).toArray
+
+  /** Name of the class -- this has to be called after we verify children 
has at least two exprs. */
+  @transient private lazy val className = 
children(0).eval().asInstanceOf[UTF8String].toString
+
+  /** True if the class exists and can be loaded. */
+  @transient private lazy val classExists = Reflect.classExists(className)
+
+  /** The reflection method. */
+  @transient lazy val method: Method = {
+val methodName = 
children(1).eval(null).asInstanceOf[UTF8String].toString
+Reflect.findMethod(className, methodName, 
argExprs.map(_.dataType)).orNull

[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70201417
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported (basically primitives
+ * and string) as input types, and the output is turned automatically to a string.
+ *
+ * @param children the first element should be a literal string for the class name,
+ * and the second element should be a literal string for the method name,
+ * and the remaining are input arguments to the Java method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

Ya. It's my fault. Sorry for that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13991: [SPARK-16318][SQL] Implement all remaining xpath functio...

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/13991
  
@cloud-fan Jenkins already ran twice successfully before.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70200185
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

It is also annoying if we search for reflect (based on the name) and then don't find an expression with reflect in the name.






[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70200163
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

I actually named it JavaMethodReflect before, but @dongjoon-hyun asked to use Reflect.






[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14081
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62072/
Test PASSed.





[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14081
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14034
  
**[Test build #62073 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62073/consoleFull)**
 for PR 14034 at commit 
[`2e6f8d8`](https://github.com/apache/spark/commit/2e6f8d8c8b5007302415b7fd984a38fc51be44bf).





[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14081
  
**[Test build #62072 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62072/consoleFull)**
 for PR 14081 at commit 
[`81611a8`](https://github.com/apache/spark/commit/81611a860031064d482f2d3b2b67f5f4ed0648dd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13704: [SPARK-15985][SQL] Reduce runtime overhead of a p...

2016-07-10 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13704#discussion_r70199583
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2018,6 +2018,8 @@ class Analyzer(
 fail(child, DateType, walkedTypePath)
   case (StringType, to: NumericType) =>
 fail(child, to, walkedTypePath)
+  case (from: ArrayType, to: ArrayType) if !from.containsNull =>
--- End diff --

I will try improving the `SimplifyCasts` rule to eliminate the cast from an array with non-nullable elements to one with nullable elements. I do not understand the following, though. Will it be handled automatically by improving `SimplifyCasts`, or do we need to improve another rule?
> "we can handle map too"

I will add unit tests for it. I think it would be good to add a benchmark to show the degree of improvement.
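To make the rule being discussed concrete: the cast in question only widens element nullability, so it can be dropped without changing runtime semantics. A toy model of that elimination (illustrative only; these are hypothetical stand-ins, not Catalyst's actual `ArrayType`, `Cast`, or `SimplifyCasts`):

```scala
// Toy stand-ins for Catalyst's ArrayType and Cast, for illustration only.
case class ToyArrayType(elementType: String, containsNull: Boolean)

sealed trait ToyExpr { def dataType: ToyArrayType }
case class ToyAttr(dataType: ToyArrayType) extends ToyExpr
case class ToyCast(child: ToyExpr, dataType: ToyArrayType) extends ToyExpr

object ToySimplifyCasts {
  // A cast from array<e, not-null elements> to array<e, nullable elements>
  // with the same element type is a no-op at runtime, so remove it.
  def apply(e: ToyExpr): ToyExpr = e match {
    case ToyCast(child, to)
        if child.dataType.elementType == to.elementType &&
          !child.dataType.containsNull && to.containsNull =>
      child
    case other => other
  }
}
```

A cast that changes the element type (or narrows nullability) would not match the guard and is kept as-is.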





[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...

2016-07-10 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/14081
  
I realized that the "PipelineExample"s are included in the docs, while the 
"SimpleTextClassificationExample"s are not, so it might be better to keep those 
instead.  I just changed the data and regularization value to that of 
"SimpleTextClassificationExample" which gives correct predictions (it looks 
like these examples were updated at one time by DB to fix this, but the change 
was not put into the doc example).





[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14114
  
Now, it's back for review again.





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70198925
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * @param children the first element should be a literal string for the class name,
+ * and the second element should be a literal string for the method name,
+ * and the remaining are input arguments to the Java method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\nc33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
+  extends Expression with CodegenFallback {
+
+  override def prettyName: String = "reflect"
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.size < 2) {
+  TypeCheckFailure("requires at least two arguments")
+} else if (!children.take(2).forall(e => e.dataType == StringType && e.foldable)) {
+  // The first two arguments must be string type.
+  TypeCheckFailure("first two arguments should be string literals")
+} else if (!classExists) {
+  TypeCheckFailure(s"class $className not found")
+} else if (method == null) {
+  TypeCheckFailure(s"cannot find a method that matches the argument types in $className")
+} else {
+  TypeCheckSuccess
+}
+  }
+
+  override def deterministic: Boolean = false
+  override def nullable: Boolean = true
+  override val dataType: DataType = StringType
+
+  override def eval(input: InternalRow): Any = {
+var i = 0
+while (i < argExprs.length) {
--- End diff --

`while` is preferred here. The `eval` method is on the critical path, and `for` loops in Scala are slow.
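The performance point is easy to see side by side. A minimal sketch (not Spark code) of the two loop styles over an argument array; the `while` form compiles to a plain JVM loop, while the `for` form desugars to `Range.foreach` with a closure invoked per element:

```scala
object LoopStyles {
  // Hot-path style used in eval(): mutable index, no closure allocation.
  def sumWhile(xs: Array[Int]): Int = {
    var i = 0
    var s = 0
    while (i < xs.length) {
      s += xs(i)
      i += 1
    }
    s
  }

  // Equivalent `for` version: desugars to (0 until xs.length).foreach(...),
  // which goes through a closure call per element.
  def sumFor(xs: Array[Int]): Int = {
    var s = 0
    for (i <- 0 until xs.length) s += xs(i)
    s
  }
}
```

Both return the same result; the difference is only in generated bytecode and per-element overhead.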





[GitHub] spark issue #14081: [SPARK-16403][Examples] Cleanup to remove unused imports...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14081
  
**[Test build #62072 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62072/consoleFull)**
 for PR 14081 at commit 
[`81611a8`](https://github.com/apache/spark/commit/81611a860031064d482f2d3b2b67f5f4ed0648dd).





[GitHub] spark pull request #13969: [SPARK-16284][SQL] Implement reflect SQL function

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13969#discussion_r70198848
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Reflect.scala
 ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.lang.reflect.Method
+
+import scala.util.Try
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import 
org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{TypeCheckFailure, 
TypeCheckSuccess}
+import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+/**
+ * An expression that invokes a method on a class via reflection.
+ *
+ * For now, only types defined in `Reflect.typeMapping` are supported 
(basically primitives
+ * and string) as input types, and the output is turned automatically to a 
string.
+ *
+ * @param children the first element should be a literal string for the 
class name,
+ * and the second element should be a literal string for 
the method name,
+ * and the remaining are input arguments to the Java 
method.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(class,method[,arg1[,arg2..]]) calls method with 
reflection",
+  extended = "> SELECT _FUNC_('java.util.UUID', 'randomUUID');\n 
c33fb387-8500-4bfa-81d2-6e0e3e930df2")
+// scalastyle:on line.size.limit
+case class Reflect(children: Seq[Expression])
--- End diff --

`Reflect` is really ambiguous, how about `CallMethod`?





[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14114
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62067/
Test PASSed.





[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...

2016-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14114
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14114: [SPARK-16458][SQL] SessionCatalog should support `listCo...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14114
  
**[Test build #62067 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62067/consoleFull)**
 for PR 14114 at commit 
[`af6692f`](https://github.com/apache/spark/commit/af6692fafda6429d87eff7decb8ec0fdabd036fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14034
  
uh... Thanks! Let me do it now.





[GitHub] spark pull request #14090: [SPARK-16112][SparkR] Programming guide for gappl...

2016-07-10 Thread NarineK
Github user NarineK commented on a diff in the pull request:

https://github.com/apache/spark/pull/14090#discussion_r70198331
  
--- Diff: docs/sparkr.md ---
@@ -306,6 +306,64 @@ head(ldf, 3)
 {% endhighlight %}
 
 
+ Run a given function on a large dataset grouping by input column(s) and using `gapply` or `gapplyCollect`
+
+# gapply
+Apply a function to each group of a `SparkDataFrame`. The function is applied to each group of the `SparkDataFrame` and should have only two parameters: a grouping key and an R `data.frame` corresponding to that key. The groups are chosen from the `SparkDataFrame`'s column(s).
+The output of the function should be a `data.frame`. The schema specifies the row format of the resulting
+`SparkDataFrame`. It must match the R function's output.
--- End diff --

Or we could probably also refer to this?
https://github.com/apache/spark/blob/master/R/pkg/R/types.R#L21





[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...

2016-07-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14124
  
@viirya Thanks for your comment! Actually, that's what I want feedback on from @marmbrus.

It seems forcing the schema to be all-nullable already happens when you read/write data via the `read`/`write` API (but not for structured streaming or the other JSON API).

So, actually, the point of this PR is to make them all consistent. The reason to make them consistent by forcing the schema to be nullable is what he said in the mailing list:

>Sure, but a traditional RDBMS has the opportunity to do validation before
>loading data in.  Thats not really an option when you are reading random
>files from S3.  This is why Hive and many other systems in this space treat
>all columns as nullable.

Actually, Parquet also reads and writes the schema with nullability correctly if we get rid of `asNullable` (I tested this before), but it seems that is prevented for (I assume) the reason above.

@marmbrus Do you mind clarifying here, please?

I think we may have to deal with this as a datasource-specific problem.
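The `asNullable` behavior under debate can be shown with a toy schema model (illustrative only; `ToyField`/`ToySchema` are made-up names, not Spark's real `StructField`/`StructType` from `org.apache.spark.sql.types`):

```scala
// Toy model of a schema whose per-column nullability gets forced to true,
// mirroring what the read/write paths do when they call asNullable.
case class ToyField(name: String, nullable: Boolean)

case class ToySchema(fields: Seq[ToyField]) {
  // Force every column to be nullable, discarding the original nullability.
  def asNullable: ToySchema = ToySchema(fields.map(_.copy(nullable = true)))
}
```

Dropping the `asNullable` call would preserve the user-declared nullability end to end, which is exactly the consistency question raised above.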






[GitHub] spark pull request #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of P...

2016-07-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/13778#discussion_r70198204
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -346,14 +346,47 @@ case class LambdaVariable(value: String, isNull: 
String, dataType: DataType) ext
 object MapObjects {
   private val curId = new java.util.concurrent.atomic.AtomicInteger()
 
+  /**
+   * Construct an instance of MapObjects case class.
+   *
+   * @param function The function applied on the collection elements.
+   * @param inputData An expression that when evaluated returns a 
collection object.
+   * @param elementType The data type of elements in the collection.
+   */
   def apply(
   function: Expression => Expression,
   inputData: Expression,
   elementType: DataType): MapObjects = {
 val loopValue = "MapObjects_loopValue" + curId.getAndIncrement()
 val loopIsNull = "MapObjects_loopIsNull" + curId.getAndIncrement()
 val loopVar = LambdaVariable(loopValue, loopIsNull, elementType)
-MapObjects(loopValue, loopIsNull, elementType, function(loopVar), 
inputData)
+MapObjects(loopValue, loopIsNull, elementType, function(loopVar), 
inputData, None)
+  }
+
+  /**
+   * Construct an instance of MapObjects case class.
+   *
+   * @param function The function applied on the collection elements.
+   * @param inputData An expression that when evaluated returns a 
collection object.
+   * @param elementType The data type of elements in the collection.
+   * @param inputDataType The explicitly given data type of inputData to override the
+   *  data type inferred from inputData (i.e., inputData.dataType).
+   *  When Python UDT whose sqlType is an array, the deserializer
+   *  expression will apply MapObjects on it. However, as the data type
+   *  of inputData is Python UDT, which is not an expected array type
+   *  in MapObjects. In this case, we need to explicitly use
+   *  Python UDT's sqlType as data type.
--- End diff --

Makes sense. I will update it later.
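
The overload being discussed (an optional explicit type that overrides the type inferred from `inputData`) can be sketched in a toy, non-Spark form. The names below are hypothetical; Spark's real `MapObjects` is a Catalyst Scala expression, not this Python function:

```python
# Toy illustration of an optional explicit-type override (hypothetical
# names, not Spark's API): when no override is supplied, fall back to
# the type inferred from the input expression; a Python UDT whose
# sqlType is an array would pass that sqlType explicitly.
class Expr:
    def __init__(self, data_type):
        self.data_type = data_type

def map_objects(fn, input_data, element_type, input_data_type=None):
    # Use the override when given, otherwise the inferred type.
    effective = input_data_type if input_data_type is not None else input_data.data_type
    return ("MapObjects", fn, element_type, effective)

udt = Expr(data_type="PythonUDT")  # inferred type would be the UDT itself
_, _, _, t = map_objects(lambda x: x, udt, "int", input_data_type="array<int>")
print(t)  # array<int>
```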





[GitHub] spark pull request #13248: [SPARK-15194] [ML] Add Python ML API for Multivar...

2016-07-10 Thread lins05
Github user lins05 commented on a diff in the pull request:

https://github.com/apache/spark/pull/13248#discussion_r70198232
  
--- Diff: python/pyspark/ml/stat/distribution.py ---
@@ -0,0 +1,267 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from pyspark.ml.linalg import DenseVector, DenseMatrix, Vector
+import numpy as np
+
+__all__ = ['MultivariateGaussian']
+
+
+
+class MultivariateGaussian():
+"""
+This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. In
+ the event that the covariance matrix is singular, the density will be computed in a
+reduced dimensional subspace under which the distribution is supported.
+(see [[http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Degenerate_case]])
+
+mu The mean vector of the distribution
+sigma The covariance matrix of the distribution
+
+
+>>> mu = Vectors.dense([0.0, 0.0])
--- End diff --

I see, but the missing import of `Vectors` would fail the doctest.
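
For readers following along, the density that `MultivariateGaussian` is documented to compute is the standard multivariate normal pdf. A minimal stdlib-only sketch for the 2-D, non-singular case (no pyspark, and deliberately not handling the degenerate-covariance case the docstring mentions):

```python
import math

def mvn_pdf_2d(x, mu, sigma):
    """Density of a 2-D multivariate normal distribution.

    x, mu: length-2 sequences; sigma: 2x2 covariance as nested lists.
    Stdlib-only sketch of the usual pdf formula; raises on a
    non-positive-definite covariance instead of handling it.
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

# Density at the mean of a standard 2-D normal is 1 / (2*pi).
print(mvn_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```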





[GitHub] spark pull request #14048: [SPARK-16370][SQL] Union queries should not be ex...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at:

https://github.com/apache/spark/pull/14048





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14048
  
Hmm. Okay, I didn't prevent all of them.
I see. I'll close.
Thank you for the decision, @cloud-fan.





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14048
  
```
case Union(children) if children.forall(x =>
  x.isInstanceOf[InsertIntoTable] || x.isInstanceOf[InsertIntoHadoopFsRelationCommand]) =>
```
This doesn't indicate a multi-insert, right? A normal `Union` can also look like this.
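
The concern can be illustrated with a toy plan hierarchy (hypothetical Python classes, not Spark's Catalyst API): the guard accepts any `Union` whose children are all insert nodes, and by itself records nothing about whether the query was actually written as a multi-insert statement.

```python
# Toy model of the pattern guard quoted above (hypothetical classes).
class LogicalPlan: ...
class InsertIntoTable(LogicalPlan): ...
class InsertIntoHadoopFsRelationCommand(LogicalPlan): ...
class Project(LogicalPlan): ...

class Union(LogicalPlan):
    def __init__(self, children):
        self.children = children

def guard_matches(plan):
    """Analogue of `children.forall(x => x.isInstanceOf[...] || ...)`."""
    return isinstance(plan, Union) and all(
        isinstance(c, (InsertIntoTable, InsertIntoHadoopFsRelationCommand))
        for c in plan.children)

print(guard_matches(Union([InsertIntoTable(), InsertIntoHadoopFsRelationCommand()])))  # True
print(guard_matches(Union([Project(), Project()])))  # False: an ordinary SELECT union
```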





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14048
  
Ah, I see. You mean Union of `INSERT INTO`s, right?





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14048
  
This PR fixes that with minimal effort.





[GitHub] spark issue #14034: [SPARK-16355] [SPARK-16354] [SQL] Fix Bugs When LIMIT/TA...

2016-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14034
  
you missed one comment: https://github.com/apache/spark/pull/14034/files#r70183958 :)





[GitHub] spark issue #13704: [SPARK-15985][SQL] Reduce runtime overhead of a program ...

2016-07-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13704
  
**[Test build #62071 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62071/consoleFull)** for PR 13704 at commit [`66800fa`](https://github.com/apache/spark/commit/66800faaebf72e492ee7693d81f8dba980f1dab2).





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14048
  
Ur, what do you mean?
> With your patch, we can still create union queries with side effects which will be executed eagerly.





[GitHub] spark issue #14048: [SPARK-16370][SQL] Union queries should not be executed ...

2016-07-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14048
  
The current one looks like this.
```
case Union(children) if children.forall(x =>
  x.isInstanceOf[InsertIntoTable] || x.isInstanceOf[InsertIntoHadoopFsRelationCommand]) =>
```




