Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22275
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r232420076
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.toPand
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r232420015
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,34 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.toPand
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r232145973
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.to
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r231311398
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.to
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r230423471
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.toPand
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r229522939
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.to
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r223197940
--- Diff: python/pyspark/sql/tests.py ---
@@ -4434,6 +4434,12 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.to
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r223116201
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId =
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r223116082
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId =
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219556033
--- Diff: python/pyspark/serializers.py ---
@@ -208,8 +214,26 @@ def load_stream(self, stream):
for batch in reader:
yield batc
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219558311
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId = spa
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219556534
--- Diff: python/pyspark/serializers.py ---
@@ -208,8 +214,26 @@ def load_stream(self, stream):
for batch in reader:
yield batc
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219561178
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId = spa
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219557215
--- Diff: python/pyspark/sql/tests.py ---
@@ -4434,6 +4434,12 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.toPand
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r219404072
--- Diff: python/pyspark/sql/tests.py ---
@@ -4434,6 +4434,12 @@ def test_timestamp_dst(self):
self.assertPandasEqual(pdf, df_from_python.to
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r214131313
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId =
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r214129436
--- Diff: python/pyspark/serializers.py ---
@@ -187,9 +187,15 @@ def loads(self, obj):
class ArrowStreamSerializer(Serializer):
"""
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r213862165
--- Diff: python/pyspark/serializers.py ---
@@ -187,9 +187,15 @@ def loads(self, obj):
class ArrowStreamSerializer(Serializer):
"""
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/22275#discussion_r213860328
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -3279,34 +3280,33 @@ class Dataset[T] private[sql](
val timeZoneId = spar
GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/22275
[SPARK-25274][PYTHON][SQL] In toPandas with Arrow send out-of-order record batches to improve performance
## What changes were proposed in this pull request?
When executing `toPandas`
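
The description above is cut off in this archive, so the following is only a rough sketch of the idea named in the PR title, not the code from the pull request. It assumes each record batch arrives tagged with the index of the partition it came from, in whatever order the partitions finished, and that the receiver restores partition order afterwards. The function name `reorder_batches` and the string stand-ins for Arrow record batches are hypothetical.

```python
from typing import Iterable, List, Tuple


def reorder_batches(arrivals: Iterable[Tuple[int, str]]) -> List[str]:
    """Restore partition order for batches received in completion order.

    Hypothetical helper: `arrivals` yields (partition_index, batch) pairs.
    Plain strings stand in for Arrow record batches.
    """
    received = []  # batches, in the order they arrived
    indices = []   # partition index recorded for each arrival
    for partition_index, batch in arrivals:
        received.append(batch)
        indices.append(partition_index)

    # Sort arrival positions by the partition index recorded for each one.
    order = sorted(range(len(indices)), key=lambda pos: indices[pos])
    return [received[pos] for pos in order]


# Partition 2 finished first, then partition 0, then partition 1.
arrivals = [(2, "batch-c"), (0, "batch-a"), (1, "batch-b")]
ordered = reorder_batches(arrivals)
```

Under these assumptions the sender never waits for an earlier partition before forwarding a later one that is already complete; the cost of restoring order moves to a cheap sort on the receiving side.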