[GitHub] spark issue #18494: [SPARK-21272] SortMergeJoin LeftAnti does not update num...

2017-07-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/18494
  
cc @hvanhovell 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18444
  
I think you could push an empty commit to retrigger the Jenkins build as a 
workaround, for example `git commit --allow-empty -m "Retrigger the build"`, 
if the test blocks what you are working on. I can't trigger the build either.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue:

https://github.com/apache/spark/pull/18444
  
retest this please





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue:

https://github.com/apache/spark/pull/18444
  
I'm not sure if Jenkins would listen to me :(





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue:

https://github.com/apache/spark/pull/18444
  
Jenkins, retest this please





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125173526
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2250,6 +2256,67 @@ def test_BinaryType_serialization(self):
 df = self.spark.createDataFrame(data, schema=schema)
 df.collect()
 
+# test for SPARK-16542
+def test_array_types(self):
+# This test need to make sure that the Scala type selected is at least
+# as large as the python's types. This is necessary because python's
+# array types depend on C implementation on the machine. Therefore there
+# is no machine independent correspondence between python's array types
+# and Scala types.
+# See: https://docs.python.org/2/library/array.html
+
+def assertCollectSuccess(typecode, value):
+a = array.array(typecode, [value])
+row = Row(myarray=a)
+df = self.spark.createDataFrame([row])
+self.assertEqual(df.collect()[0]["myarray"][0], value)
+
+supported_types = []
+
+# test string types
+if sys.version < "4":
--- End diff --

Yes, this looks better I think. I will change that.





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125173517
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2250,6 +2256,67 @@ def test_BinaryType_serialization(self):
 df = self.spark.createDataFrame(data, schema=schema)
 df.collect()
 
+# test for SPARK-16542
+def test_array_types(self):
+# This test need to make sure that the Scala type selected is at least
+# as large as the python's types. This is necessary because python's
+# array types depend on C implementation on the machine. Therefore there
+# is no machine independent correspondence between python's array types
+# and Scala types.
+# See: https://docs.python.org/2/library/array.html
+
+def assertCollectSuccess(typecode, value):
+a = array.array(typecode, [value])
+row = Row(myarray=a)
+df = self.spark.createDataFrame([row])
+self.assertEqual(df.collect()[0]["myarray"][0], value)
+
+supported_types = []
+
+# test string types
+if sys.version < "4":
+supported_types += ['u']
+assertCollectSuccess('u', "a")
+if sys.version < "3":
+supported_types += ['c']
+assertCollectSuccess('c', "a")
+
+# test float and double, assuming IEEE 754 floating-point format
+supported_types += ['f', 'd']
+assertCollectSuccess('f', ctypes.c_float(1e+38).value)
+assertCollectSuccess('f', ctypes.c_float(1e-38).value)
+assertCollectSuccess('f', ctypes.c_float(1.123456).value)
+assertCollectSuccess('d', ctypes.c_double(1e+308).value)
+assertCollectSuccess('d', ctypes.c_double(1e+308).value)
+assertCollectSuccess('d', ctypes.c_double(1.123456789012345).value)
+
+# test int types
+supported_int = list(set(_array_int_typecode_ctype_mappings.keys()).
+intersection(set(_array_type_mappings.keys())))
+supported_types += supported_int
+for i in supported_int:
+ctype = _array_int_typecode_ctype_mappings[i]
+if i.isupper():
--- End diff --

Yes, good idea. I will change that.





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125173508
  
--- Diff: python/pyspark/sql/types.py ---
@@ -935,6 +936,86 @@ def _parse_datatype_json_value(json_value):
 long: LongType,
 })
 
+# Mapping Python array types to Spark SQL DataType
+# We should be careful here. The size of these types in python depends on C
+# implementation. We need to make sure that this conversion does not lose any
+# precision.
+#
+# Reference for C integer size, see:
+# ISO/IEC 9899:201x specification, § 5.2.4.2.1 Sizes of integer types <limits.h>.
+# Reference for python array typecode, see:
+# https://docs.python.org/2/library/array.html
+# https://docs.python.org/3.6/library/array.html
+
+_array_int_typecode_ctype_mappings = {
+'b': ctypes.c_byte,
+'B': ctypes.c_ubyte,
+'h': ctypes.c_short,
+'H': ctypes.c_ushort,
+'i': ctypes.c_int,
+'I': ctypes.c_uint,
+'l': ctypes.c_long,
+'L': ctypes.c_ulong
+}
+
+# TODO: Uncomment this when 'q' and 'Q' are supported by net.razorvine.pickle
+# Type code 'q' and 'Q' are not available at python 2
+# if sys.version > "2":
+# _array_int_typecode_ctype_mappings.update({
+# 'q': ctypes.c_longlong,
+# 'Q': ctypes.c_ulonglong
+# })
+
+
+def _int_size_to_type(size):
+"""
+Return the Scala type from the size of integers.
+"""
+if size <= 8:
+return ByteType
+if size <= 16:
+return ShortType
+if size <= 32:
+return IntegerType
+if size <= 64:
+return LongType
+raise TypeError("not supported type: integer size too large.")
--- End diff --

But I think I will add a one-line comment here to make the code more 
readable.





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125173496
  
--- Diff: python/pyspark/sql/types.py ---
@@ -935,6 +936,86 @@ def _parse_datatype_json_value(json_value):
 long: LongType,
 })
 
+# Mapping Python array types to Spark SQL DataType
+# We should be careful here. The size of these types in python depends on C
+# implementation. We need to make sure that this conversion does not lose any
+# precision.
+#
+# Reference for C integer size, see:
+# ISO/IEC 9899:201x specification, § 5.2.4.2.1 Sizes of integer types <limits.h>.
+# Reference for python array typecode, see:
+# https://docs.python.org/2/library/array.html
+# https://docs.python.org/3.6/library/array.html
+
+_array_int_typecode_ctype_mappings = {
+'b': ctypes.c_byte,
+'B': ctypes.c_ubyte,
+'h': ctypes.c_short,
+'H': ctypes.c_ushort,
+'i': ctypes.c_int,
+'I': ctypes.c_uint,
+'l': ctypes.c_long,
+'L': ctypes.c_ulong
+}
+
+# TODO: Uncomment this when 'q' and 'Q' are supported by net.razorvine.pickle
+# Type code 'q' and 'Q' are not available at python 2
+# if sys.version > "2":
+# _array_int_typecode_ctype_mappings.update({
+# 'q': ctypes.c_longlong,
+# 'Q': ctypes.c_ulonglong
+# })
+
+
+def _int_size_to_type(size):
+"""
+Return the Scala type from the size of integers.
+"""
+if size <= 8:
+return ByteType
+if size <= 16:
+return ShortType
+if size <= 32:
+return IntegerType
+if size <= 64:
+return LongType
+raise TypeError("not supported type: integer size too large.")
--- End diff --

I don't think we should log this. This is just a helper function used to 
construct `_array_type_mappings`, which is the complete list of supported 
type codes. Being filtered out here is not an error; it is by design. If users 
try to use an unsupported typecode, they will see another `TypeError`, raised 
at line 1052:
```python
...
elif isinstance(obj, array):
    if obj.typecode in _array_type_mappings:
        return ArrayType(_array_type_mappings[obj.typecode](), True)
    else:
        raise TypeError("not supported type: array(%s)" % obj.typecode)
...
```
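
To make the "filtered out by design" point concrete, here is a minimal standalone sketch of how such a mapping could be built and then consulted. It is not the PR's code: the type names are plain strings standing in for the Spark SQL classes, the helper names (`int_size_to_type`, `build_array_type_mappings`, `infer_array_type`) are hypothetical, and returning `None` for oversized integers (instead of raising) is an assumption made for brevity.

```python
import ctypes

# Hypothetical stand-ins for ByteType, ShortType, IntegerType and LongType,
# so the sketch runs without pyspark installed.
SIZE_LIMITS = ((8, "ByteType"), (16, "ShortType"),
               (32, "IntegerType"), (64, "LongType"))

INT_TYPECODE_CTYPES = {
    'b': ctypes.c_byte, 'B': ctypes.c_ubyte,
    'h': ctypes.c_short, 'H': ctypes.c_ushort,
    'i': ctypes.c_int, 'I': ctypes.c_uint,
    'l': ctypes.c_long, 'L': ctypes.c_ulong,
}


def int_size_to_type(bits):
    """Return the smallest signed type name that holds `bits` bits, or None."""
    for limit, name in SIZE_LIMITS:
        if bits <= limit:
            return name
    return None


def build_array_type_mappings():
    """Keep only the type codes that fit into a signed JVM integer type."""
    mappings = {}
    for code, ctype in INT_TYPECODE_CTYPES.items():
        bits = ctypes.sizeof(ctype) * 8
        if code.isupper():
            # The JVM has no unsigned integers, so an unsigned C type needs
            # one extra bit to fit into a signed type.
            bits += 1
        target = int_size_to_type(bits)
        if target is not None:  # silently filtered out, not an error
            mappings[code] = target
    return mappings


def infer_array_type(typecode, mappings):
    """The user-facing check: unsupported type codes fail here."""
    if typecode in mappings:
        return mappings[typecode]
    raise TypeError("not supported type: array(%s)" % typecode)
```

With this shape, a type code such as `'L'` on a platform where `unsigned long` is 64 bits simply never enters the mapping, and the only error a user sees is the one raised at lookup time.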





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125173449
  
--- Diff: python/pyspark/sql/types.py ---
@@ -935,6 +936,86 @@ def _parse_datatype_json_value(json_value):
 long: LongType,
 })
 
+# Mapping Python array types to Spark SQL DataType
+# We should be careful here. The size of these types in python depends on C
+# implementation. We need to make sure that this conversion does not lose any
+# precision.
+#
+# Reference for C integer size, see:
+# ISO/IEC 9899:201x specification, § 5.2.4.2.1 Sizes of integer types <limits.h>.
+# Reference for python array typecode, see:
+# https://docs.python.org/2/library/array.html
+# https://docs.python.org/3.6/library/array.html
+
+_array_int_typecode_ctype_mappings = {
+'b': ctypes.c_byte,
+'B': ctypes.c_ubyte,
+'h': ctypes.c_short,
+'H': ctypes.c_ushort,
+'i': ctypes.c_int,
+'I': ctypes.c_uint,
+'l': ctypes.c_long,
+'L': ctypes.c_ulong
+}
+
+# TODO: Uncomment this when 'q' and 'Q' are supported by net.razorvine.pickle
+# Type code 'q' and 'Q' are not available at python 2
+# if sys.version > "2":
+# _array_int_typecode_ctype_mappings.update({
+# 'q': ctypes.c_longlong,
+# 'Q': ctypes.c_ulonglong
+# })
+
+
+def _int_size_to_type(size):
+"""
+Return the Scala type from the size of integers.
+"""
+if size <= 8:
+return ByteType
+if size <= 16:
+return ShortType
+if size <= 32:
+return IntegerType
+if size <= 64:
+return LongType
+raise TypeError("not supported type: integer size too large.")
+
+_array_type_mappings = {
+# Warning: Actual properties for float and double in C is not unspecified.
--- End diff --

Yes, you are correct. Thanks for figuring this out. 

I just checked, `sys.float_info` is the info for C type double, not for C 
type float:
```python
>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
>>> sys.float_info.dig
15
```
So we cannot use this to check the range for C float. But this is not the 
main reason I'm not using it. 

The main reason is this: although C does not require IEEE-754 floating point 
types, every C platform I have ever seen uses IEEE-754. (There is also a 
StackOverflow question about this: 
https://stackoverflow.com/questions/31967040/is-it-safe-to-assume-floating-point-is-represented-using-ieee754-floats-in-c
 ) I don't even know whether a platform exists on which Python is supported, 
the JVM is supported, and the C floating point types are not IEEE-754. So I 
think it is OK to assume for now that these types are IEEE-754, which keeps 
the code cleaner. It is not worth the effort to support something that might 
not even exist. But I'm not an expert on this either, so if you know someone 
who might know more, please ping them to double check. On the other hand, if 
users on such a platform do find that this assumption is wrong, they can file 
a new bug. But yes, my comment is confusing, and I will try to make it clearer.

Also, thank you for pointing out `sys.float_info`. Although I don't think I 
need it here, it would be very useful in test cases. I will change some of my 
test cases to use it to make the code more readable.
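
As a small illustration of the double-versus-float point, the sketch below compares `sys.float_info` (which describes C double) with the single-precision limit. It assumes IEEE-754, as discussed above; `FLOAT32_MAX` and `float32_roundtrip` are hypothetical names introduced only for this example.

```python
import ctypes
import struct
import sys

# Largest finite IEEE-754 binary32 value, built from its bit pattern.
# The struct module's standard "<f" format is defined as IEEE-754 binary32.
FLOAT32_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]


def float32_roundtrip(x):
    """Round x to the nearest C float and convert it back to a Python float."""
    return ctypes.c_float(x).value


# sys.float_info.max is the double limit, far above the C float limit.
print(sys.float_info.max > FLOAT32_MAX)          # True
# A value that is not exactly representable as a C float changes on the way.
print(float32_roundtrip(1.123456) == 1.123456)   # False
# A value that is exactly representable survives unchanged.
print(float32_roundtrip(0.5) == 0.5)             # True
```

This is also why the test in the diff compares against `ctypes.c_float(...).value` rather than the original literal for type code `'f'`.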





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17096
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79038/
Test PASSed.





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17096
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17096
  
**[Test build #79038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79038/testReport)** for PR 17096 at commit [`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18502: [SPARK-21278][PySpark] Upgrade to Py4J 0.10.5

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18502
  
**[Test build #79039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79039/testReport)** for PR 18502 at commit [`bfdfcf5`](https://github.com/apache/spark/commit/bfdfcf52b26a598bf49289fd0646d667024264e5).





[GitHub] spark pull request #18502: [SPARK-21278][PySpark] Upgrade to Py4J 0.10.5

2017-07-01 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/18502

[SPARK-21278][PySpark] Upgrade to Py4J 0.10.5

## What changes were proposed in this pull request?

This PR aims to bump Py4J in order to fix the following float/double bug.
Py4J 0.10.5 fixes this (https://github.com/bartdag/py4j/issues/272).

**BEFORE**
```
>>> df = spark.range(1)
>>> df.select(df['id'] + 17.133574204226083).show()
+--------------------+
|(id + 17.1335742042)|
+--------------------+
|       17.1335742042|
+--------------------+
```

**AFTER**
```
>>> df = spark.range(1)
>>> df.select(df['id'] + 17.133574204226083).show()
+-------------------------+
|(id + 17.133574204226083)|
+-------------------------+
|       17.133574204226083|
+-------------------------+
```
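
The truncated value in the BEFORE output is what you get when a double is rendered with only 12 significant digits, which is how Python 2's `str()` formats floats. Whether that is the exact code path involved in Py4J is an assumption here, not something this thread states. A minimal sketch of the difference, using the hypothetical helper names `lossy_text` and `roundtrip_text`:

```python
def lossy_text(x):
    # 12 significant digits, mimicking how Python 2's str() renders floats.
    return format(x, ".12g")


def roundtrip_text(x):
    # repr() yields text that parses back to exactly the same double.
    return repr(x)


x = 17.133574204226083
print(lossy_text(x))                    # 17.1335742042
print(float(lossy_text(x)) == x)        # False: precision was lost
print(float(roundtrip_text(x)) == x)    # True: the value survives
```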

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-21278

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18502.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18502


commit bfdfcf52b26a598bf49289fd0646d667024264e5
Author: Dongjoon Hyun 
Date:   2017-07-02T04:18:32Z

[SPARK-21278][PySpark] Upgrade to Py4J 0.10.5







[GitHub] spark issue #18139: [SPARK-20787][PYTHON] PySpark can't handle datetimes bef...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/18139
  
One option, which wouldn't be very efficient at the border cases but would 
handle the general case, is to fall back when an `OverflowError` is raised 
(on top of avoiding the range that the Python docs say is known to cause it).
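
A rough sketch of that fallback idea, outside Spark: try the fast conversion first and take the slower path only when `OverflowError` is raised. The function name, the injectable `primary` converter, and the UTC-based fallback are all hypothetical choices made for illustration; the actual PR may handle this differently.

```python
import calendar
import datetime
import time


def to_epoch_seconds(dt, primary=time.mktime):
    """Convert a naive datetime to epoch seconds, with a fallback path."""
    try:
        # Fast path: the platform conversion, which treats dt as local time.
        return int(primary(dt.timetuple()))
    except OverflowError:
        # Fallback (assumption): read the same fields as UTC instead.
        # calendar.timegm is plain arithmetic and accepts pre-1970 dates.
        return calendar.timegm(dt.utctimetuple())


print(to_epoch_seconds(datetime.datetime(2000, 1, 1)))
```

Note that the fallback in this sketch ignores the local UTC offset, so a real fix would still need to correct for the time zone on that path.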





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17096
  
**[Test build #79038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79038/testReport)** for PR 17096 at commit [`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09).





[GitHub] spark pull request #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing suppor...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17096#discussion_r125173260
  
--- Diff: python/pyspark/ml/tests.py ---
@@ -319,6 +320,20 @@ def test_hasparam(self):
 testParams = TestParams()
 self.assertTrue(all([testParams.hasParam(p.name) for p in testParams.params]))
 self.assertFalse(testParams.hasParam("notAParameter"))
+self.assertTrue(testParams.hasParam(u"maxIter"))
+
+def test_resolveparam(self):
+testParams = TestParams()
+self.assertEqual(testParams._resolveParam(testParams.maxIter), testParams.maxIter)
+self.assertEqual(testParams._resolveParam("maxIter"), testParams.maxIter)
+
+self.assertEqual(testParams._resolveParam(u"maxIter"), testParams.maxIter)
+if sys.version_info[0] >= 3:
--- End diff --

@holdenk, would this test address your concern enough?





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue:

https://github.com/apache/spark/pull/18444
  
jenkins retest this please





[GitHub] spark issue #18459: [SPARK-13534][PYSPARK] Using Apache Arrow to increase pe...

2017-07-01 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/18459
  
nah, i got it.

On Sat, Jul 1, 2017 at 7:19 PM, Holden Karau wrote:

> @shaneknapp  - I can do the update if you
> want next week? Let me know :)






[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79037/
Test PASSed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79037 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79037/testReport)** for PR 18498 at commit [`0ba6094`](https://github.com/apache/spark/commit/0ba6094be3a743cd2834871115880bbe2fdd1c12).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more user-...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/17267
  
Hey @uncleGen, any time to add a test for this?





[GitHub] spark issue #15113: [SPARK-17508][PYSPARK][ML] PySpark treat Param values No...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/15113
  
So I think our current handling of `None` is confusing, but I'd really like 
to see what the other committers have to say on this.





[GitHub] spark issue #18501: [SPARK-20256][SQL] SessionState should be created more l...

2017-07-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/18501
  
Hi, @cloud-fan .
Could you review this PR?





[GitHub] spark issue #18501: [SPARK-20256][SQL] SessionState should be created more l...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18501
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18501: [SPARK-20256][SQL] SessionState should be created more l...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18501
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79036/
Test PASSed.





[GitHub] spark issue #18501: [SPARK-20256][SQL] SessionState should be created more l...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18501
  
**[Test build #79036 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79036/testReport)** for PR 18501 at commit [`8517fe8`](https://github.com/apache/spark/commit/8517fe89068e5fe695a3647c15e539c9a55f0d7e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18453: [SPARK-19852][PYSPARK][ML] Python StringIndexer supports...

2017-07-01 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18453
  
LGTM too.





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/17096
  
@HyukjinKwon Pretty much, yes.





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17096
  
Do you mean a test case such as
`self.assertEqual(testParams._resolveParam(u"아"), testParams.아)`?





[GitHub] spark issue #18453: [SPARK-19852][PYSPARK][ML] Python StringIndexer supports...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/18453
  
LGTM





[GitHub] spark pull request #18453: [SPARK-19852][PYSPARK][ML] Python StringIndexer s...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18453#discussion_r125172564
  
--- Diff: python/pyspark/ml/feature.py ---
@@ -2132,6 +2132,12 @@ class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid,
 "frequencyDesc, frequencyAsc, alphabetDesc, alphabetAsc.",
 typeConverter=TypeConverters.toString)
 
+    handleInvalid = Param(Params._dummy(), "handleInvalid", "how to handle invalid data (unseen " +
--- End diff --

We do, although this is currently the only concrete use of `handleInvalid`.
Other models are slated to start using it in the future.





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125172166
  
--- Diff: python/pyspark/sql/types.py ---
@@ -935,6 +936,86 @@ def _parse_datatype_json_value(json_value):
 long: LongType,
 })
 
+# Mapping Python array types to Spark SQL DataType
+# We should be careful here. The size of these types in python depends on C
+# implementation. We need to make sure that this conversion does not lose any
+# precision.
+#
+# Reference for C integer size, see:
+# ISO/IEC 9899:201x specification, § 5.2.4.2.1 Sizes of integer types <limits.h>.
+# Reference for python array typecode, see:
+# https://docs.python.org/2/library/array.html
+# https://docs.python.org/3.6/library/array.html
+
+_array_int_typecode_ctype_mappings = {
+    'b': ctypes.c_byte,
+    'B': ctypes.c_ubyte,
+    'h': ctypes.c_short,
+    'H': ctypes.c_ushort,
+    'i': ctypes.c_int,
+    'I': ctypes.c_uint,
+    'l': ctypes.c_long,
+    'L': ctypes.c_ulong
+}
+
+# TODO: Uncomment this when 'q' and 'Q' are supported by net.razorvine.pickle
+# Type code 'q' and 'Q' are not available at python 2
+# if sys.version > "2":
+#     _array_int_typecode_ctype_mappings.update({
+#         'q': ctypes.c_longlong,
+#         'Q': ctypes.c_ulonglong
+#     })
+
+
+def _int_size_to_type(size):
+    """
+    Return the Scala type from the size of integers.
+    """
+    if size <= 8:
+        return ByteType
+    if size <= 16:
+        return ShortType
+    if size <= 32:
+        return IntegerType
+    if size <= 64:
+        return LongType
+    raise TypeError("not supported type: integer size too large.")
--- End diff --

So in this case we are silently filtering out integer types from supported 
conversions - do we expect this to happen? Should we log this?
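
To make the question concrete, here is a minimal, self-contained sketch of how unsupported typecodes could be logged instead of dropped silently. It is an illustration only, not the PR's code: the type names are plain strings standing in for the Spark SQL classes, `build_int_mappings` is a hypothetical helper, and the extra bit reserved for unsigned typecodes is an assumption (made because the JVM has no unsigned integer types).

```python
import ctypes
import logging

logger = logging.getLogger("array_type_mappings")

# Typecode -> ctypes type, mirroring the table in the diff above.
INT_TYPECODES = {
    'b': ctypes.c_byte, 'B': ctypes.c_ubyte,
    'h': ctypes.c_short, 'H': ctypes.c_ushort,
    'i': ctypes.c_int, 'I': ctypes.c_uint,
    'l': ctypes.c_long, 'L': ctypes.c_ulong,
}


def int_size_to_type(size):
    """Return the name of the smallest signed type that holds `size` bits."""
    for bits, name in ((8, "ByteType"), (16, "ShortType"),
                       (32, "IntegerType"), (64, "LongType")):
        if size <= bits:
            return name
    raise TypeError("not supported type: integer size too large.")


def build_int_mappings():
    """Map each typecode to a type name and log the ones that do not fit."""
    mappings, skipped = {}, []
    for code, ctype in INT_TYPECODES.items():
        size = ctypes.sizeof(ctype) * 8
        if code.isupper():
            size += 1  # assumption: unsigned values need one extra signed bit
        try:
            mappings[code] = int_size_to_type(size)
        except TypeError:
            skipped.append(code)
            logger.warning("array typecode %r (%d bits) is not supported",
                           code, size)
    return mappings, skipped
```

Under this assumption, on a platform where `ctypes.c_ulong` is 64 bits wide, `'L'` would end up in `skipped` and produce a warning rather than vanish quietly.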





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125171058
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2250,6 +2256,67 @@ def test_BinaryType_serialization(self):
 df = self.spark.createDataFrame(data, schema=schema)
 df.collect()
 
+    # test for SPARK-16542
+    def test_array_types(self):
+        # This test need to make sure that the Scala type selected is at least
+        # as large as the python's types. This is necessary because python's
+        # array types depend on C implementation on the machine. Therefore there
+        # is no machine independent correspondence between python's array types
+        # and Scala types.
+        # See: https://docs.python.org/2/library/array.html
+
+        def assertCollectSuccess(typecode, value):
+            a = array.array(typecode, [value])
+            row = Row(myarray=a)
+            df = self.spark.createDataFrame([row])
+            self.assertEqual(df.collect()[0]["myarray"][0], value)
+
+        supported_types = []
+
+        # test string types
+        if sys.version < "4":
+            supported_types += ['u']
+            assertCollectSuccess('u', "a")
+        if sys.version < "3":
+            supported_types += ['c']
+            assertCollectSuccess('c', "a")
+
+        # test float and double, assuming IEEE 754 floating-point format
+        supported_types += ['f', 'd']
+        assertCollectSuccess('f', ctypes.c_float(1e+38).value)
+        assertCollectSuccess('f', ctypes.c_float(1e-38).value)
+        assertCollectSuccess('f', ctypes.c_float(1.123456).value)
+        assertCollectSuccess('d', ctypes.c_double(1e+308).value)
+        assertCollectSuccess('d', ctypes.c_double(1e+308).value)
+        assertCollectSuccess('d', ctypes.c_double(1.123456789012345).value)
+
+        # test int types
+        supported_int = list(set(_array_int_typecode_ctype_mappings.keys()).
+                             intersection(set(_array_type_mappings.keys())))
+        supported_types += supported_int
+        for i in supported_int:
+            ctype = _array_int_typecode_ctype_mappings[i]
+            if i.isupper():
--- End diff --

In the code that builds the mapping you have a comment about `isupper` meaning
unsigned. For Scala SQL devs who may have to debug this in the future, I'd
duplicate that comment here.
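
For readers less familiar with the `array` module, a small sketch of why the case of an integer typecode matters. `typecode_value_range` is a hypothetical helper written for this note, not something from the PR.

```python
import ctypes


def typecode_value_range(typecode, ctype):
    """Return (min, max) storable in an array with this integer typecode."""
    bits = ctypes.sizeof(ctype) * 8
    if typecode.isupper():
        # Upper-case integer typecodes ('B', 'H', 'I', 'L') are unsigned.
        return 0, 2 ** bits - 1
    # Lower-case integer typecodes ('b', 'h', 'i', 'l') are signed.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
```

For example, `typecode_value_range('B', ctypes.c_ubyte)` gives `(0, 255)`, while the signed `'b'` covers `(-128, 127)`, so the two need different boundary values in a test.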





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125171044
  
--- Diff: python/pyspark/sql/types.py ---
@@ -935,6 +936,86 @@ def _parse_datatype_json_value(json_value):
 long: LongType,
 })
 
+# Mapping Python array types to Spark SQL DataType
+# We should be careful here. The size of these types in python depends on C
+# implementation. We need to make sure that this conversion does not lose any
+# precision.
+#
+# Reference for C integer size, see:
+# ISO/IEC 9899:201x specification, § 5.2.4.2.1 Sizes of integer types <limits.h>.
+# Reference for python array typecode, see:
+# https://docs.python.org/2/library/array.html
+# https://docs.python.org/3.6/library/array.html
+
+_array_int_typecode_ctype_mappings = {
+    'b': ctypes.c_byte,
+    'B': ctypes.c_ubyte,
+    'h': ctypes.c_short,
+    'H': ctypes.c_ushort,
+    'i': ctypes.c_int,
+    'I': ctypes.c_uint,
+    'l': ctypes.c_long,
+    'L': ctypes.c_ulong
+}
+
+# TODO: Uncomment this when 'q' and 'Q' are supported by net.razorvine.pickle
+# Type code 'q' and 'Q' are not available at python 2
+# if sys.version > "2":
+#     _array_int_typecode_ctype_mappings.update({
+#         'q': ctypes.c_longlong,
+#         'Q': ctypes.c_ulonglong
+#     })
+
+
+def _int_size_to_type(size):
+    """
+    Return the Scala type from the size of integers.
+    """
+    if size <= 8:
+        return ByteType
+    if size <= 16:
+        return ShortType
+    if size <= 32:
+        return IntegerType
+    if size <= 64:
+        return LongType
+    raise TypeError("not supported type: integer size too large.")
+
+_array_type_mappings = {
+    # Warning: Actual properties for float and double in C is not unspecified.
--- End diff --

So I think you mean to say "is not specified"
On the other hand, is there a reason we couldn't use sys.float_info to 
determine the ranges on this? I have not done a lot of poking with float arrays 
in Python though so just a question.
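
As a rough sketch of what `sys.float_info` offers: it describes Python's own `float`, which is a C `double`, so it could inform the `'d'` case but says nothing about C `float` (the `'f'` typecode). The helper names below are hypothetical.

```python
import sys


def double_limits():
    """Collect the limits Python reports for its float type (a C double)."""
    info = sys.float_info
    return {
        "max": info.max,            # largest finite value
        "min": info.min,            # smallest positive normalized value
        "mant_dig": info.mant_dig,  # bits of mantissa precision
        "dig": info.dig,            # decimal digits that round-trip reliably
    }


def looks_like_ieee754_double():
    """Heuristic: radix 2, 53-bit mantissa, max exponent 1024 match binary64."""
    info = sys.float_info
    return info.radix == 2 and info.mant_dig == 53 and info.max_exp == 1024
```

A test could use `double_limits()["max"]` as the boundary value for `'d'` instead of a hard-coded `1e+308`, and skip or adjust when `looks_like_ieee754_double()` is false.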





[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18444#discussion_r125170990
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -2250,6 +2256,67 @@ def test_BinaryType_serialization(self):
 df = self.spark.createDataFrame(data, schema=schema)
 df.collect()
 
+    # test for SPARK-16542
+    def test_array_types(self):
+        # This test need to make sure that the Scala type selected is at least
+        # as large as the python's types. This is necessary because python's
+        # array types depend on C implementation on the machine. Therefore there
+        # is no machine independent correspondence between python's array types
+        # and Scala types.
+        # See: https://docs.python.org/2/library/array.html
+
+        def assertCollectSuccess(typecode, value):
+            a = array.array(typecode, [value])
+            row = Row(myarray=a)
+            df = self.spark.createDataFrame([row])
+            self.assertEqual(df.collect()[0]["myarray"][0], value)
+
+        supported_types = []
+
+        # test string types
+        if sys.version < "4":
--- End diff --

consider maybe using sys.version_info?
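
A short sketch of why: `sys.version` is a string, so `<` compares it character by character, while `sys.version_info` compares numerically as a tuple. `is_before` is a hypothetical helper used only to show the difference.

```python
import sys


def is_before(version_info, major, minor=0):
    """Compare a version tuple numerically against (major, minor)."""
    return tuple(version_info[:2]) < (major, minor)


# String comparison is lexicographic, so a hypothetical "10.0" sorts before "4".
assert "10.0" < "4"
# Tuple comparison is numeric, element by element.
assert not is_before((10, 0), 4)
assert is_before((3, 6), 4)

# The check from the diff, rewritten with sys.version_info:
supports_u_typecode = sys.version_info < (4,)
```

The same pitfall shows up within a major version: as strings, `"3.10" < "3.9"` is true.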





[GitHub] spark issue #18459: [SPARK-13534][PYSPARK] Using Apache Arrow to increase pe...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/18459
  
@shaneknapp - I can do the update if you want next week? Let me know :)





[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/17096
  
I'd really like to see that further test I was talking about, @HyukjinKwon 
-- it shouldn't be too hard to do in this pr right? Just add a unicode string 
which doesn't make sense in down converted ascii.
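
Something along these lines would pick out a suitable string. This is only a sketch: `survives_ascii` is a hypothetical helper and the parameter names are placeholders.

```python
def survives_ascii(text):
    """Return True if `text` can be encoded as ASCII without loss."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


ascii_name = u"maxIter"
non_ascii_name = u"\uc544"  # the Hangul syllable "아"
```

A name for which `survives_ascii` returns `False` is the interesting case, since any code path that quietly down-converts it to ASCII has to fail or lose information.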





[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/14830
  
Gentle follow-up ping. I've got some bandwidth next week.





[GitHub] spark issue #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() s...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/14579
  
@MLnick - or if you don't have a chance would it be ok for us to find 
someone (perhaps someone new to the project) to take this over and bring it to 
the finish line?





[GitHub] spark issue #18339: [SPARK-21094][PYTHON] Add popen_kwargs to launch_gateway

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/18339
  
This is interesting, I've got a similar approach I've been working on in 
https://github.com/apache/spark/pull/17298 which has some issues inside of 
PyPI. Would that suit your needs if I extended it to allow you to enable it 
manually in addition to when the pipe was overloaded?

Let me know.

In the meantime, jenkins ok to test.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79037 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79037/testReport)** for PR 18498 at commit [`0ba6094`](https://github.com/apache/spark/commit/0ba6094be3a743cd2834871115880bbe2fdd1c12).





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18498
  
retest this please





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18498
  
Retest this please





[GitHub] spark issue #18501: [SPARK-20256][SQL] SessionState should be created lazily

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18501
  
**[Test build #79036 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79036/testReport)** for PR 18501 at commit [`8517fe8`](https://github.com/apache/spark/commit/8517fe89068e5fe695a3647c15e539c9a55f0d7e).





[GitHub] spark pull request #18501: [SPARK-20256][SQL] SessionState should be created...

2017-07-01 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/18501

[SPARK-20256][SQL] SessionState should be created lazily

## What changes were proposed in this pull request?

`SessionState` is designed to be created lazily. However, in reality, it is
created immediately in `SparkSession.Builder.getOrCreate`
([here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L943)).

This PR aims to recover the lazy behavior by keeping the options in
`initialSessionOptions`. The benefit is that users can start `spark-shell`
and use RDD operations without any problems.

**BEFORE**
```scala
$ bin/spark-shell
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
  at ...
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.security.AccessControlException: Permission denied: user=spark, access=READ, inode="/apps/hive/warehouse":hive:hdfs:drwx--
at 
```
As reported in SPARK-20256, this happens when this user is not allowed to
access the warehouse directory.

**AFTER**
```scala
$ bin/spark-shell
...
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sc.range(0, 10, 1).count()
res0: Long = 10
```
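
The idea of deferring state creation while holding on to the builder options can be sketched in a language-neutral way. The Python below is a toy analogy under stated assumptions, not Spark's implementation: `Session`, `state_builds`, and `range_count` are hypothetical names, and only `initial_options` echoes the `initialSessionOptions` mentioned above.

```python
class Session:
    """Toy session that defers building its state until first use."""

    def __init__(self, initial_options=None):
        # Keep the options around instead of applying them eagerly.
        self.initial_options = dict(initial_options or {})
        self._state = None
        self.state_builds = 0  # counts how often the state was built

    @property
    def state(self):
        if self._state is None:
            # Expensive step (in Spark this could touch the Hive warehouse).
            self.state_builds += 1
            self._state = {"conf": dict(self.initial_options)}
        return self._state

    def range_count(self, n):
        """Stand-in for an RDD-only operation that never needs `state`."""
        return len(range(n))
```

With this shape, calling `range_count` never triggers the expensive step; the state is built once, on the first access to `state`, using the options saved at construction time.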

## How was this patch tested?

Manual.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-20256

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18501.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18501


commit 8517fe89068e5fe695a3647c15e539c9a55f0d7e
Author: Dongjoon Hyun 
Date:   2017-07-02T00:17:45Z

[SPARK-20256][SQL] SessionState should be created lazily







[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-07-01 Thread keypointt
Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/17451
  
Oh now I got you, will do. Thank you  :)

Sent from my iPhone at Canada Place 🇨🇦

On Sat, Jul 1, 2017 at 4:39 PM Holden Karau wrote:

> *@holdenk* commented on this pull request.
>
> In mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala:
>
> > @@ -274,6 +274,29 @@ class Word2VecModel private[ml] (
>  wordVectors.findSynonyms(word, num)
>}
>
> +  /**
>
> Yes, so as I mentioned you could do the map function with the _1() and
> _2() to convert it entirely in the Python side.
>






[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79033/
Test FAILed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79033 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79033/testReport)** for PR 18498 at commit [`bc5ab81`](https://github.com/apache/spark/commit/bc5ab81d91a4e646fa97e26378fb8ed2afb3a0e8).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/18444
  
@zasdfgbnm if the failures are unrelated you can ask jenkins to re-run the 
tests with "jenkins retest this please".

One thing: even though GitHub reports that this merges cleanly, Jenkins reports
conflicts, so to be safe I'd update to the latest master.





[GitHub] spark pull request #17451: [SPARK-19866][ML][PySpark] Add local version of W...

2017-07-01 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17451#discussion_r125170914
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala ---
@@ -274,6 +274,29 @@ class Word2VecModel private[ml] (
 wordVectors.findSynonyms(word, num)
   }
 
+  /**
--- End diff --

Yes, so as I mentioned you could do the map function with the `_1()` and 
`_2()` to convert it entirely in the Python side.
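
A minimal sketch of that suggestion, assuming the JVM side hands back a list of Scala `Tuple2` proxies that expose `_1()` and `_2()`. `FakeTuple2` and `to_python_pairs` are hypothetical names introduced only so the example runs without a JVM; a real Py4J proxy is not constructed this way.

```python
class FakeTuple2:
    """Stand-in for a Py4J proxy of a Scala Tuple2 (illustration only)."""

    def __init__(self, first, second):
        self._first = first
        self._second = second

    def _1(self):
        return self._first

    def _2(self):
        return self._second


def to_python_pairs(java_tuples):
    """Convert Tuple2-like objects into plain Python tuples."""
    return [(t._1(), t._2()) for t in java_tuples]
```

For example, `to_python_pairs([FakeTuple2("spark", 0.9)])` yields `[("spark", 0.9)]`, so no extra Scala helper is needed for the conversion.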





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18500
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79034/
Test PASSed.





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18500
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18500
  
**[Test build #79034 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79034/testReport)**
 for PR 18500 at commit 
[`2f03524`](https://github.com/apache/spark/commit/2f035245f35f967d10528ee66138862b98f12fcd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18353: [SPARK-21142][SS] spark-streaming-kafka-0-10 should depe...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18353
  
**[Test build #3825 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3825/testReport)**
 for PR 18353 at commit 
[`e5ec70b`](https://github.com/apache/spark/commit/e5ec70bc78268b7530a465e85a745ee70d801e18).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79035/
Test FAILed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79035 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79035/testReport)**
 for PR 18498 at commit 
[`0ba6094`](https://github.com/apache/spark/commit/0ba6094be3a743cd2834871115880bbe2fdd1c12).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue:

https://github.com/apache/spark/pull/18444
  
@ueshin @HyukjinKwon I think I'm done now. There are still test failures, 
but they don't look related to my change.





[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/18464
  
Hi, @guoxiaolongzte .
Could you add the logic?





[GitHub] spark pull request #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-for...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18498#discussion_r125169040
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1839,15 +1839,19 @@ def from_json(col, schema, options={}):
 string.
 
 :param col: string column in json format
-:param schema: a StructType or ArrayType of StructType to use when parsing the json column
+:param schema: a StructType or ArrayType of StructType to use when parsing the json column.
 :param options: options to control parsing. accepts the same options as the json datasource
 
+.. note:: Since Spark 2.3, the DDL-formatted string is also supported for ``schema``.
--- End diff --

Just to be sure: I think we also allow a JSON-format schema here, but I 
wonder whether we should document that behaviour, and thereby promote its 
usage, now that we have the DDL-formatted string. Actually, one of the reasons 
DDL was introduced is the inconvenience of the JSON-format schema.
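
To make the inconvenience concrete, here is a minimal sketch comparing the two notations for the same flat schema. It does not call Spark: `ddl_to_json_schema` is a hypothetical helper, the type mapping is an assumed small subset, and the parsing is far simpler than Spark's actual DDL parser.

```python
import json

# Assumed mapping from a few DDL type keywords to the type names used in
# Spark's JSON schema representation (illustrative subset only).
_DDL_TO_JSON_TYPE = {
    "INT": "integer",
    "BIGINT": "long",
    "DOUBLE": "double",
    "BOOLEAN": "boolean",
    "STRING": "string",
}


def ddl_to_json_schema(ddl):
    """Convert a flat DDL string such as "a INT, b STRING" to a JSON schema string.

    Hypothetical helper for illustration: no nested types, no comments,
    no quoted column names.
    """
    fields = []
    for column in ddl.split(","):
        name, ddl_type = column.split()
        fields.append({
            "name": name,
            "type": _DDL_TO_JSON_TYPE[ddl_type.upper()],
            "nullable": True,
            "metadata": {},
        })
    return json.dumps({"type": "struct", "fields": fields})


ddl_schema = "a INT, b STRING"
json_schema = ddl_to_json_schema(ddl_schema)
print(ddl_schema)
print(json_schema)
```

The JSON form carries the same information as the DDL form but is several times longer, which is the usability gap the comment refers to.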





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79035 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79035/testReport)**
 for PR 18498 at commit 
[`0ba6094`](https://github.com/apache/spark/commit/0ba6094be3a743cd2834871115880bbe2fdd1c12).





[GitHub] spark pull request #18500: [MINOR][DOC] Version related doc fix in functions...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at:

https://github.com/apache/spark/pull/18500





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18500
  
Actually, let me pick this up in https://github.com/apache/spark/pull/18498.





[GitHub] spark pull request #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-for...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18498#discussion_r125168993
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3078,9 +3078,8 @@ object functions {
* string.
*
* @param e a string column containing JSON data.
-   * @param schema the schema to use when parsing the json string as a json string. In Spark 2.1,
-   *   the user-provided schema has to be in JSON format. Since Spark 2.2, the DDL
-   *   format is also supported for the schema.
+   * @param schema the schema to use when parsing the json string as a json string
+   *   or a DDL-formatted string.
--- End diff --

(These are not strictly related to this PR, but I just decided to pick 
them up.)





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18500
  
**[Test build #79034 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79034/testReport)**
 for PR 18500 at commit 
[`2f03524`](https://github.com/apache/spark/commit/2f035245f35f967d10528ee66138862b98f12fcd).





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18500
  
I searched for the same instances in `functions.scala` and I believe these 
are all of them. Please pick this change up if anyone has a PR fixing code 
around this (rather than leaving it to cause a conflict).





[GitHub] spark issue #18500: [MINOR][DOC] Version related doc fix in functions.scala

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18500
  
I searched for the same instances in `functions.scala` and I believe these 
are all of them. Please pick this change up if anyone has a PR touching this 
code path (rather than leaving it to cause a conflict).





[GitHub] spark pull request #18500: [MINOR][DOC] Version related doc fix in functions...

2017-07-01 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/18500

[MINOR][DOC] Version related doc fix in functions.scala

## What changes were proposed in this pull request?

This PR proposes to:

- remove the description of the 2.1.0 and 2.2.0 behaviour change from the 
documentation of the `from_json` variant added in 2.3.0.

- change `@since 2.0` to `@since 2.0.0` in `hash` for consistency.
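
A search for other two-part `@since` tags could be sketched as follows. This is only an illustration: `find_short_since_tags` is a hypothetical helper, and `sample` is made-up text standing in for the Scaladoc comments, not an excerpt from `functions.scala`.

```python
import re

# Matches "@since X.Y" where the version is NOT continued by ".Z" or by
# further digits, i.e. a two-part version such as "2.0".
SHORT_SINCE = re.compile(r"@since\s+(\d+\.\d+)(?!\.\d)\b")


def find_short_since_tags(source):
    """Return (line_number, version) pairs for two-part @since tags."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = SHORT_SINCE.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Made-up sample input; only the first line uses a two-part version.
sample = """\
   * @since 2.0
   * @since 2.0.0
   * @since 1.6.0
   * @since 2.10.0
"""
print(find_short_since_tags(sample))
```

The negative lookahead rejects full `X.Y.Z` versions, and the trailing word boundary keeps the pattern from matching a prefix of a multi-digit minor version.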

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark minor-doc-from_json

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18500.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18500


commit 2f035245f35f967d10528ee66138862b98f12fcd
Author: hyukjinkwon 
Date:   2017-07-01T20:28:54Z

Minor version related fix







[GitHub] spark pull request #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-for...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18498#discussion_r125168486
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1839,7 +1839,8 @@ def from_json(col, schema, options={}):
 string.
 
 :param col: string column in json format
-:param schema: a StructType or ArrayType of StructType to use when parsing the json column
+:param schema: a StructType or ArrayType of StructType to use when parsing the json column.
+   Since Spark 2.3, the DDL-formatted string is also supported for the schema.
--- End diff --

To my knowledge, we don't have a specific rule for describing additional 
behaviours in the Python documentation (at least, I have seen such changes 
merged from time to time). I tried to make this prettier, as below.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79033 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79033/testReport)**
 for PR 18498 at commit 
[`bc5ab81`](https://github.com/apache/spark/commit/bc5ab81d91a4e646fa97e26378fb8ed2afb3a0e8).





[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18464
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-for...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18498#discussion_r125168275
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1839,15 +1839,19 @@ def from_json(col, schema, options={}):
 string.
 
 :param col: string column in json format
-:param schema: a StructType or ArrayType of StructType to use when parsing the json column
+:param schema: a StructType or ArrayType of StructType to use when parsing the json column.
 :param options: options to control parsing. accepts the same options as the json datasource
 
+.. note:: Since Spark 2.3, the DDL-formatted string is also supported for ``schema``.
--- End diff --

Now, it looks as below:

https://user-images.githubusercontent.com/6477701/27765087-ef86dbaa-5ee3-11e7-86ac-bbc5d61adc3c.png






[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18464
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79032/
Test PASSed.





[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18464
  
**[Test build #79032 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79032/testReport)**
 for PR 18464 at commit 
[`aa269a6`](https://github.com/apache/spark/commit/aa269a6a95078347f9b521b086acc34d911be9af).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18498
  
Thank you @felixcheung for your guidance. I have addressed the comments here 
for now.





[GitHub] spark issue #18353: [SPARK-21142][SS] spark-streaming-kafka-0-10 should depe...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18353
  
**[Test build #3825 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3825/testReport)**
 for PR 18353 at commit 
[`e5ec70b`](https://github.com/apache/spark/commit/e5ec70bc78268b7530a465e85a745ee70d801e18).





[GitHub] spark pull request #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-for...

2017-07-01 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18498#discussion_r125167705
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -1839,7 +1839,8 @@ def from_json(col, schema, options={}):
 string.
 
 :param col: string column in json format
-:param schema: a StructType or ArrayType of StructType to use when parsing the json column
+:param schema: a StructType or ArrayType of StructType to use when parsing the json column.
+   Since Spark 2.3, the DDL-formatted string is also supported for the schema.
--- End diff --

Is this generally how we indicate behavior changes in Python?





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/18498
  
Cool, I will look more later today.
About `structType.fromDDL`: for the other methods in this group, such as 
`structType.structField`, the name after the `.` is actually a type name. This 
is the S3 convention, and it is how method calls are routed (for example, 
when we call `structType(field, field)`).

We are not very consistent here (unfortunately, see `sparkR.*`), but in this 
case I think it's best not to mix conventions. How about calling it 
`structTypeFromDDL`?
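
The routing concern can be sketched with a toy emulation of S3-style dispatch. This is written in Python purely for illustration and is not SparkR code: `s3_dispatch` and `registry` are hypothetical, and the lookup only mimics the `generic.class` then `generic.default` naming rule.

```python
def s3_dispatch(registry, generic, obj_class):
    """Look up "<generic>.<class>", falling back to "<generic>.default"."""
    for name in (generic + "." + obj_class, generic + ".default"):
        if name in registry:
            return registry[name]
    raise LookupError("no method for %s on class %s" % (generic, obj_class))


# Hypothetical method table keyed by S3-style names.
registry = {
    "structType.structField": lambda: "build from structField objects",
    # Under the naming rule, this entry reads as "the structType method for
    # objects of class 'fromDDL'", which is not what the name is meant to say.
    "structType.fromDDL": lambda: "parse a DDL string",
}

# A call on a structField argument is routed by the suffix after the dot.
print(s3_dispatch(registry, "structType", "structField")())

# The DDL helper would be selected for any object whose class is "fromDDL".
print(s3_dispatch(registry, "structType", "fromDDL")())
```

A name without the dot, such as `structTypeFromDDL`, stays outside this lookup entirely.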






[GitHub] spark issue #18451: [SPARK-18004][SQL] Make sure the date or timestamp relat...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18451
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79030/
Test PASSed.





[GitHub] spark issue #18451: [SPARK-18004][SQL] Make sure the date or timestamp relat...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18451
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18451: [SPARK-18004][SQL] Make sure the date or timestamp relat...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18451
  
**[Test build #79030 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79030/testReport)**
 for PR 18451 at commit 
[`9808e63`](https://github.com/apache/spark/commit/9808e63a7a2055173e3e75a555e3a0d846e08dea).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14431: [SPARK-16258][SparkR] Automatically append the grouping ...

2017-07-01 Thread NarineK
Github user NarineK commented on the issue:

https://github.com/apache/spark/pull/14431
  
I think 'prepend' sounds better. What do you think? 
Yes, the `key` in `function(key, x) { x }` can be useful for some use cases, 
but I also think the user could easily prepend it to the dataframe if 
needed, since the `key` is already available.
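
The two options can be sketched with a toy grouped apply. This is plain Python for illustration, not the SparkR `gapply` implementation: `grouped_apply` and its `prepend_key` flag are hypothetical names.

```python
from collections import defaultdict


def grouped_apply(rows, key_index, func, prepend_key=False):
    """Group rows by one column, call func(key, group) for each group, and
    optionally prepend the grouping key to every output row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    result = []
    for key, group in groups.items():
        for out_row in func(key, group):
            if prepend_key:
                result.append((key,) + tuple(out_row))
            else:
                result.append(tuple(out_row))
    return result


rows = [("a", 1), ("b", 2), ("a", 3)]

# Option 1: the framework prepends the key automatically.
auto = grouped_apply(rows, 0, lambda key, g: [(sum(r[1] for r in g),)],
                     prepend_key=True)

# Option 2: the user prepends the key, since it is passed to the function.
manual = grouped_apply(rows, 0, lambda key, g: [(key, sum(r[1] for r in g))])

print(auto)
print(manual)
```

Both calls produce the same rows, which is the point made above: the key is already available to the user function, so automatic prepending is a convenience rather than a necessity.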






[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79026/
Test FAILed.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Build finished. Test FAILed.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18444
  
**[Test build #79026 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79026/testReport)**
 for PR 18444 at commit 
[`fe035a6`](https://github.com/apache/spark/commit/fe035a660958621b570132db3b8e4f1225e6bc47).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18498
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79031/
Test PASSed.





[GitHub] spark issue #18498: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18498
  
**[Test build #79031 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79031/testReport)** for PR 18498 at commit [`bedf092`](https://github.com/apache/spark/commit/bedf09203574f2f53c5073df49c630b64e103e7f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18496: [SparkR][SPARK-20307]:SparkR: pass on setHandleInvalid t...

2017-07-01 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/18496
  
I will fix it tonight. It is weird. In my local test, it passed. It seems 
that my new change doesn't apply to the test. Anyway, I will fix the failure 
first.





[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18464
  
**[Test build #79032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79032/testReport)** for PR 18464 at commit [`aa269a6`](https://github.com/apache/spark/commit/aa269a6a95078347f9b521b086acc34d911be9af).





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79028/
Test FAILed.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Merged build finished. Test FAILed.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18444
  
**[Test build #79028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79028/testReport)** for PR 18444 at commit [`cca2d6a`](https://github.com/apache/spark/commit/cca2d6aeaff2368f20a13b24e13e82a85cd9ce5e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18464: [SPARK-21250][WEB-UI]Add a url in the table of 'Running ...

2017-07-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/18464
  
Retest this please.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79027/
Test FAILed.





[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18444
  
Merged build finished. Test FAILed.




