[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...

2017-08-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r132188490 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java --- @@ -65,15 +65,35 @@ final Row row

[GitHub] spark pull request #18933: [WIP][SPARK-21722][SQL][PYTHON] Enable timezone-a...

2017-08-15 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18933#discussion_r133229705 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -912,6 +912,14 @@ object SQLConf { .intConf

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-05-15 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/15821 @BryanCutler , are the Timestamp and Date types supported now with Arrow 0.3? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark issue #15821: [SPARK-13534][PySpark] Using Apache Arrow to increase pe...

2017-05-16 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/15821 >@icexelloss , yes Arrow supports it but Spark stores timestamps in a different way, which caused some complications. After talking with Holden, we agreed it was better to keep this PR to sim

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-12-05 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 Hi @BryanCutler @HyukjinKwon @ueshin , mind taking another look? I think this is in good shape. Thanks!

[GitHub] spark pull request #23248: [SPARK-26293][SQL] Cast exception when having pyt...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/23248#discussion_r239565253 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -131,8 +131,20 @@ object ExtractPythonUDFs

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239587020 --- Diff: python/pyspark/sql/tests/test_pandas_udf_window.py --- @@ -44,9 +44,18 @@ def python_plus_one(self): @property def

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239587089 --- Diff: python/pyspark/sql/tests/test_pandas_udf_window.py --- @@ -231,12 +266,10 @@ def test_array_type(self): self.assertEquals(result1

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239587065 --- Diff: python/pyspark/sql/tests/test_pandas_udf_window.py --- @@ -87,8 +96,34 @@ def ordered_window(self): def unpartitioned_window(self

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239587136 --- Diff: python/pyspark/sql/tests/test_pandas_udf_window.py --- @@ -245,11 +278,101 @@ def test_invalid_args(self): foo_udf

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239587375 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -144,24 +282,107 @@ case class

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-12-07 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r239922856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -144,24 +282,107 @@ case class

[GitHub] spark pull request #23248: [SPARK-26293][SQL] Cast exception when having pyt...

2018-12-07 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/23248#discussion_r239925749 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -131,8 +131,20 @@ object ExtractPythonUDFs

[GitHub] spark pull request #22208: [SPARK-25216][SQL] Improve error message when a c...

2018-08-24 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22208#discussion_r212716787 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -216,8 +216,16 @@ class Dataset[T] private[sql]( private[sql] def

[GitHub] spark issue #22244: [WIP][SPARK-24721][SPARK-25213][SQL] extract python UDF ...

2018-08-27 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22244 @cloud-fan Thanks! I will take a look later today and incorporate this with my patch.

[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...

2018-08-28 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 Thanks all for the review!

[GitHub] spark issue #22208: [SPARK-25216][SQL] Improve error message when a column c...

2018-08-28 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22208 @dongjoon-hyun Could you please take another look? I changed it to use the resolver and to try resolving the column with backticks, and added unit tests as well

[GitHub] spark issue #22208: [SPARK-25216][SQL] Improve error message when a column c...

2018-08-29 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22208 @dongjoon-hyun SGTM. I misunderstood your suggestion about resolver. Keeping it simple was my preference too.

[GitHub] spark pull request #22305: [WIP][SPARK-24561][SQL][Python] User-defined wind...

2018-08-31 Thread icexelloss
GitHub user icexelloss opened a pull request: https://github.com/apache/spark/pull/22305 [WIP][SPARK-24561][SQL][Python] User-defined window aggregation functions with Pandas UDF (bounded window) ## What changes were proposed in this pull request? ### **This is currently

[GitHub] spark issue #22305: [WIP][SPARK-24561][SQL][Python] User-defined window aggr...

2018-08-31 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 The current state is a minimum working version - I copied some code from `WindowExec` to make this work but will need to refactor those

[GitHub] spark issue #22104: [SPARK-24721][SQL] Extract Python UDFs at the end of opt...

2018-09-03 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 @cloud-fan Sure! Updated

[GitHub] spark pull request #22329: [SPARK-25328][PYTHON] Add an example for having t...

2018-09-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22329#discussion_r214940744 --- Diff: python/pyspark/sql/functions.py --- @@ -2804,6 +2804,20 @@ def pandas_udf(f=None, returnType=None, functionType=None): | 1|1.5

[GitHub] spark pull request #22329: [SPARK-25328][PYTHON] Add an example for having t...

2018-09-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22329#discussion_r215267320 --- Diff: python/pyspark/sql/functions.py --- @@ -2804,6 +2804,22 @@ def pandas_udf(f=None, returnType=None, functionType=None): | 1|1.5

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22329 LGTM

[GitHub] spark issue #22305: [WIP][SPARK-24561][SQL][Python] User-defined window aggr...

2018-09-14 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 retest this please

[GitHub] spark pull request #22305: [WIP][SPARK-24561][SQL][Python] User-defined wind...

2018-09-17 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r218243887 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the

[GitHub] spark pull request #22305: [WIP][SPARK-24561][SQL][Python] User-defined wind...

2018-09-17 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r218244042 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-10-23 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 @felixcheung I am waiting for some in-depth review. @ueshin do you have some time to review this in the near future? Thanks

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-10-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r227591428 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -63,7 +65,7 @@ private[spark] object PythonEvalType

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-10-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r227591518 --- Diff: python/pyspark/sql/tests.py --- @@ -6481,12 +6516,116 @@ def test_invalid_args(self): foo_udf = pandas_udf(lambda x: x, &#

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-10-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r227591746 --- Diff: python/pyspark/sql/tests.py --- @@ -6323,6 +6333,33 @@ def ordered_window(self): def unpartitioned_window(self): return

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-05 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 Hey @gatorsmile it has been quite a while with no review progress on this. @BryanCutler has some initial comments but I want to get more people's feedback before addressing those. Since no

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 No worries. Thank you @HyukjinKwon and @ueshin

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-08 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232084279 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -73,68 +118,147 @@ case class

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232388369 --- Diff: python/pyspark/worker.py --- @@ -154,6 +154,47 @@ def wrapped(*series): return lambda *a: (wrapped(*a), arrow_return_type

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232393187 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -73,68 +118,147 @@ case class

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232393305 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -73,68 +118,147 @@ case class

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232393452 --- Diff: python/pyspark/sql/tests.py --- @@ -6323,6 +6333,33 @@ def ordered_window(self): def unpartitioned_window(self): return

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232393335 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -27,17 +27,62 @@ import

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232393476 --- Diff: python/pyspark/sql/tests.py --- @@ -6323,6 +6333,33 @@ def ordered_window(self): def unpartitioned_window(self): return

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r234790403 --- Diff: python/pyspark/sql/tests.py --- @@ -7064,12 +7098,104 @@ def test_invalid_args(self): foo_udf = pandas_udf(lambda x: x, &#

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r234790364 --- Diff: python/pyspark/sql/tests.py --- @@ -89,6 +89,7 @@ from pyspark.sql.types import _merge_type from pyspark.tests import QuietTest

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r234790633 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -73,68 +118,151 @@ case class

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-19 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r234790479 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -27,17 +27,62 @@ import

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-20 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r235182927 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -63,7 +65,7 @@ private[spark] object PythonEvalType

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-20 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22305 @BryanCutler @HyukjinKwon @ueshin I have addressed all the comments so far. Could you please take another look? Thanks

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-21 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r235417425 --- Diff: python/pyspark/worker.py --- @@ -154,6 +154,47 @@ def wrapped(*series): return lambda *a: (wrapped(*a), arrow_return_type
