[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9766


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-13 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83159493
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class using reflection, for use from pyspark
+   *
+   * @param name   udf name
+   * @param className   fully qualified class name of udf
+   * @param returnDataType  return type of udf. If it is null, spark would try to infer via reflection.
+   */
+  def registerJava(name: String, className: String, returnDataType: DataType): Unit = {
+
+    try {
+      val clazz = Utils.classForName(className)
+      val udfInterfaces = clazz.getGenericInterfaces
+        .filter(_.isInstanceOf[ParameterizedType])
+        .map(_.asInstanceOf[ParameterizedType])
+        .filter(e => e.getRawType.isInstanceOf[Class[_]] && e.getRawType.asInstanceOf[Class[_]].getCanonicalName.startsWith("org.apache.spark.sql.api.java.UDF"))
+      if (udfInterfaces.length == 0) {
+        throw new IOException(s"UDF class ${className} doesn't implement any UDF interface")
+      } else if (udfInterfaces.length > 1) {
+        throw new IOException(s"It is invalid to implement multiple UDF interfaces, UDF class ${className}")
+      } else {
+        try {
+          val udf = clazz.newInstance()
+          val udfReturnType = udfInterfaces(0).getActualTypeArguments.last
+          var returnType = returnDataType
+          if (returnType == null) {
+            if (udfReturnType.isInstanceOf[Class[_]]) {
+              returnType = udfReturnType.asInstanceOf[Class[_]].getCanonicalName match {
--- End diff --

Thanks for the hint, fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-13 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83159473
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class using reflection, for use from pyspark
+   *
+   * @param name   udf name
+   * @param className   fully qualified class name of udf
+   * @param returnDataType  return type of udf. If it is null, spark would try to infer via reflection.
+   */
+  def registerJava(name: String, className: String, returnDataType: DataType): Unit = {
--- End diff --

Fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-13 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83159425
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,32 @@ def registerFunction(self, name, f, returnType=StringType()):
         """
         self.sparkSession.catalog.registerFunction(name, f, returnType)

+    @ignore_unicode_prefix
+    @since(2.1)
+    def registerJavaFunction(self, name, javaClassName, returnType=None):
+        """Register a java UDF so it can be used in SQL statements.
+
+        In addition to a name and the function itself, the return type can be optionally specified.
+        When the return type is not given it would infer the returnType via reflection.
--- End diff --

Fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83134970
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
   
//
 
   /**
--- End diff --

I can turn it on, but it would make the function less readable, especially for the following statements, which go beyond the line length limit.

```
case 14 => register(name, udf.asInstanceOf[UDF13[_, _, _, _, _, _, _, _, _, _, _, _, _, _]], returnType)
```
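
If keeping the generated dispatch unchecked is the concern, one possible compromise (a sketch only, not taken from the patch; the `line.size.limit` rule name is an assumption based on Spark's usual scalastyle configuration) is to scope the suppression to just the generated arms and keep the hand-written part checked:

```
// Sketch: hand-written code stays under normal style rules; only the
// auto-generated long lines sit inside a narrowly scoped suppression.
object StyleScopeSketch {
  def dispatchFor(arity: Int): String = {
    require(arity >= 1, "a UDF needs at least one argument") // normal rules apply here
    // scalastyle:off line.size.limit
    arity match {
      case 13 => "register(name, udf.asInstanceOf[UDF13[_, _, _, _, _, _, _, _, _, _, _, _, _, _]], returnType)" // generated-style long line
      case n => s"register(name, udf.asInstanceOf[UDF$n[...]], returnType)"
    }
    // scalastyle:on line.size.limit
  }
}
```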


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83057793
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class using reflection, for use from pyspark
+   *
+   * @param name   udf name
+   * @param className   fully qualified class name of udf
+   * @param returnDataType  return type of udf. If it is null, spark would try to infer via reflection.
+   */
+  def registerJava(name: String, className: String, returnDataType: DataType): Unit = {
+
+    try {
+      val clazz = Utils.classForName(className)
+      val udfInterfaces = clazz.getGenericInterfaces
+        .filter(_.isInstanceOf[ParameterizedType])
+        .map(_.asInstanceOf[ParameterizedType])
+        .filter(e => e.getRawType.isInstanceOf[Class[_]] && e.getRawType.asInstanceOf[Class[_]].getCanonicalName.startsWith("org.apache.spark.sql.api.java.UDF"))
+      if (udfInterfaces.length == 0) {
+        throw new IOException(s"UDF class ${className} doesn't implement any UDF interface")
+      } else if (udfInterfaces.length > 1) {
+        throw new IOException(s"It is invalid to implement multiple UDF interfaces, UDF class ${className}")
+      } else {
+        try {
+          val udf = clazz.newInstance()
+          val udfReturnType = udfInterfaces(0).getActualTypeArguments.last
+          var returnType = returnDataType
+          if (returnType == null) {
+            if (udfReturnType.isInstanceOf[Class[_]]) {
+              returnType = udfReturnType.asInstanceOf[Class[_]].getCanonicalName match {
--- End diff --

Can we use 
[`JavaTypeInference`](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala)
 here?
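
For reference, a minimal sketch of what that substitution could look like, assuming `JavaTypeInference.inferDataType` accepts a `Class[_]` and returns a `(DataType, nullable)` pair (the helper overload and the surrounding method are assumptions here, not taken from the patch):

```
// Hypothetical sketch: infer the UDF's return DataType through catalyst's
// JavaTypeInference instead of matching on canonical class names.
import org.apache.spark.sql.catalyst.JavaTypeInference
import org.apache.spark.sql.types.DataType

def inferReturnType(udfReturnType: java.lang.reflect.Type, declared: DataType): DataType = {
  Option(declared).getOrElse {
    udfReturnType match {
      // inferDataType is assumed to return (DataType, nullable); keep only the data type
      case c: Class[_] => JavaTypeInference.inferDataType(c)._1
      case other => throw new java.io.IOException(
        s"Cannot infer a Spark SQL DataType for UDF return type $other; please specify it explicitly")
    }
  }
}
```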


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83056944
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
   
//
 
   /**
--- End diff --

It would be nice to turn the style checks back on here, since most of this function is not auto-generated.


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83056792
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/test/JavaStringLength.java ---
@@ -0,0 +1,30 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.test;
+
+import org.apache.spark.sql.api.java.UDF1;
+
+/**
+ * It is used for register Java UDF from PySpark
+ */
+public class JavaStringLength implements UDF1 {
--- End diff --

Could this be moved to `src/test`?  It would be better to not distribute it.


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83056607
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,32 @@ def registerFunction(self, name, f, returnType=StringType()):
         """
         self.sparkSession.catalog.registerFunction(name, f, returnType)

+    @ignore_unicode_prefix
+    @since(2.1)
+    def registerJavaFunction(self, name, javaClassName, returnType=None):
+        """Register a java UDF so it can be used in SQL statements.
+
+        In addition to a name and the function itself, the return type can be optionally specified.
+        When the return type is not given it would infer the returnType via reflection.
--- End diff --

nit: it's a little odd to mix `return type` with `returnType`.  Perhaps: "When the return type is not specified we attempt to infer it using reflection".


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r83057132
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -414,6 +418,84 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class using reflection, for use from pyspark
+   *
+   * @param name   udf name
+   * @param className   fully qualified class name of udf
+   * @param returnDataType  return type of udf. If it is null, spark would try to infer via reflection.
+   */
+  def registerJava(name: String, className: String, returnDataType: DataType): Unit = {
--- End diff --

Is it possible to make this non-public?  I believe we do this in other 
cases for code only called from python.


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82969210
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,10 @@ def registerFunction(self, name, f, returnType=StringType()):
 """
 self.sparkSession.catalog.registerFunction(name, f, returnType)
 
+def registerJavaFunction(self, name, javaClassName, returnType):
--- End diff --

Fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82969338
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -17,9 +17,15 @@
 
 package org.apache.spark.sql
 
+
+import java.io.IOException
+import java.util.{List => JList, Map => JMap}
+
 import scala.reflect.runtime.universe.TypeTag
 import scala.util.Try
 
+import sun.reflect.generics.reflectiveObjects.ParameterizedTypeImpl
--- End diff --

Fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-12 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82969355
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class
+   * @param name
+   * @param className
+   * @param returnType
+   */
+  def registerJava(name: String, className: String, returnType: DataType): Unit = {
+
+    try {
+      // scalastyle:off classforname
--- End diff --

Fixed


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82876529
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class
+   * @param name
+   * @param className
+   * @param returnType
+   */
+  def registerJava(name: String, className: String, returnType: DataType): Unit = {
--- End diff --

+1


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82876442
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class
+   * @param name
+   * @param className
+   * @param returnType
+   */
+  def registerJava(name: String, className: String, returnType: DataType): Unit = {
+
+    try {
+      // scalastyle:off classforname
--- End diff --

This style rule is here to prevent misuse.  Is there a reason we aren't 
using our utility functions?
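
For comparison, a minimal sketch of the utility-based form that the later revision (quoted earlier in this thread) adopts; `Utils.classForName(className)` comes from that updated diff, while the wrapping helper below is hypothetical:

```
// Sketch: load the UDF class through Spark's Utils helper rather than a raw
// Class.forName call, so the right class loader is used.
import org.apache.spark.util.Utils

def loadUdfClass(className: String): Class[_] = {
  try {
    Utils.classForName(className)
  } catch {
    case e: ClassNotFoundException =>
      throw new java.io.IOException(s"Cannot load UDF class $className", e)
  }
}
```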


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82876419
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,26 @@ def registerFunction(self, name, f, returnType=StringType()):
         """
         self.sparkSession.catalog.registerFunction(name, f, returnType)

+    @ignore_unicode_prefix
+    @since(2.1)
+    def registerJavaFunction(self, name, javaClassName, returnType=StringType()):
+        """Register a java UDF so it can be used in SQL statements.
+
+        In addition to a name and the function itself, the return type can be optionally specified.
+        When the return type is not given it default to a string and conversion will automatically
--- End diff --

Where does this conversion happen?  Are we sure that it works, given that I don't see any tests for it?


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82876509
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -17,9 +17,15 @@
 
 package org.apache.spark.sql
 
+
+import java.io.IOException
+import java.util.{List => JList, Map => JMap}
+
 import scala.reflect.runtime.universe.TypeTag
 import scala.util.Try
 
+import sun.reflect.generics.reflectiveObjects.ParameterizedTypeImpl
--- End diff --

Is this JVM specific?  What is this being used for?  Is there another way?
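
For what it's worth, the later revision quoted earlier in the thread answers this by switching to the portable `java.lang.reflect.ParameterizedType` interface; a standalone sketch of that approach (the helper name is hypothetical):

```
// Sketch: detect Spark's org.apache.spark.sql.api.java UDF interfaces via the
// standard java.lang.reflect.ParameterizedType API instead of the Sun-internal
// ParameterizedTypeImpl class, which ties the code to a specific JVM.
import java.lang.reflect.ParameterizedType

def sparkUdfInterfaces(clazz: Class[_]): Seq[ParameterizedType] = {
  clazz.getGenericInterfaces.toSeq
    .collect { case p: ParameterizedType => p }
    .filter { p =>
      p.getRawType match {
        case c: Class[_] => c.getCanonicalName.startsWith("org.apache.spark.sql.api.java.UDF")
        case _ => false
      }
    }
}
```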


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-07 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r82450939
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,10 @@ def registerFunction(self, name, f, returnType=StringType()):
 """
 self.sparkSession.catalog.registerFunction(name, f, returnType)
 
+def registerJavaFunction(self, name, javaClassName, returnType):
--- End diff --

+1


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-04 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r81877809
  
--- Diff: python/pyspark/sql/context.py ---
@@ -202,6 +202,10 @@ def registerFunction(self, name, f, returnType=StringType()):
 """
 self.sparkSession.catalog.registerFunction(name, f, returnType)
 
+def registerJavaFunction(self, name, javaClassName, returnType):
--- End diff --

It would be good to have tests that call this from the Python side; it probably also needs a docstring and a `@since` annotation (see `registerFunction`, which takes similar params).
It might also make sense to have the same default return type of `StringType` as the current `registerFunction`, but that is more a matter of taste.


[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-08-03 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/9766#discussion_r73413872
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
@@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends

    //

    /**
+   * Register a Java UDF class
+   * @param name
+   * @param className
+   * @param returnType
+   */
+  def registerJava(name: String, className: String, returnType: DataType): Unit = {
--- End diff --

I don't know if we want to expose this API for general use - maybe make it 
private so that it can only be called from Python? And maybe update the 
scaladoc to something like "Register a Java UDF class using reflection - for 
use from Python".
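
A rough sketch of what that could look like; the enclosing class name is a placeholder, and only the `private[sql]` modifier and the scaladoc wording are the point:

```
// Sketch: keep the reflective entry point reachable from the Py4J bridge but
// out of the public Scala/Java API by package-qualifying its visibility.
package org.apache.spark.sql

import org.apache.spark.sql.types.DataType

class UDFRegistrationSketch {
  /**
   * Register a Java UDF class using reflection - for use from Python.
   */
  private[sql] def registerJava(name: String, className: String, returnDataType: DataType): Unit = {
    // reflection-based registration would go here (see the quoted diff above)
  }
}
```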

