[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143481973


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Applied the comments. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143463571


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Ah, I got it.






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143211271


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Oh, it is supposed to be used to create the default index of the pandas API 
on Spark in the follow-up PR.
   
   Testing this function against the pandas API on Spark code requires 
modifying several other files as well.
   
   So I separated the current work for review convenience.
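
   For context, a distributed-sequence index gives every row a consecutive integer id (0, 1, 2, ...) even though the rows are spread across partitions. The block below is a minimal pure-Python sketch of that general idea, assuming partitions can be modeled as plain lists. The helper name `assign_distributed_sequence_ids` is hypothetical, and this is not the implementation behind `_distributed_sequence_id`, which this thread does not show.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def assign_distributed_sequence_ids(
    partitions: Sequence[Sequence[str]],
) -> List[List[Tuple[int, str]]]:
    """Attach a globally consecutive, 0-based id to every row.

    Hypothetical helper for illustration only; partitions are plain lists.
    """
    # Pass 1: count the rows in each partition.
    counts = [len(part) for part in partitions]
    # An exclusive prefix sum of the counts is each partition's starting id.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: id = partition offset + position of the row inside the partition.
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


# Four partitions of uneven size, including an empty one.
parts = [["a", "b"], [], ["c"], ["d", "e", "f"]]
indexed = assign_distributed_sequence_ids(parts)
```

   The two-pass shape (count, then offset) is one common way to get gap-free ids without moving all rows to a single worker; whether the function in this PR works this way is an assumption.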



##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Or I could add the other fixes to the current PR if the current fix looks 
good so far.
   
   If so, this PR would be more like "initial support for pandas API on Spark" 
rather than "Add `_distributed_sequence_id` for distributed-sequence index.".







[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143207159


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Or I could add the other fixes to the current PR if the current fix looks 
good so far.
   
   If so, this PR would be more like "initial support for pandas API on Spark" 
rather than "Add `_distributed_sequence_id` for distributed-sequence index.". 
:-)






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143198700


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Oh, it is supposed to be used to create the default index of the pandas API 
on Spark in the follow-up PR.
   
   Testing this function against the pandas API on Spark code requires 
modifying several other files as well.
   
   So I separated the current work for review convenience.



##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Oh, it is supposed to be used to create the default index of the pandas API 
on Spark in the follow-up PR.
   
   Testing this function against the pandas API on Spark code requires 
modifying several other files as well.
   
   So I separated the current work for review convenience.






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143207159


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Or I could add the other fixes to the current PR if the current fix looks 
good so far.






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143198700


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Oh, it is supposed to be used to create the default index of the pandas API 
on Spark in the follow-up PR.
   
   To test this function by applying it to the pandas API on Spark code, 
several other files must also be modified.
   
   So I separated the current work for review convenience.






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143200759


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   On second thought, it would be good to have at least one test in the 
current PR as an example.
   
   Let me address it!






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143200759


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   On second thought, it would be good to have at least one test in the 
current PR as an example.
   
   Thanks!






[GitHub] [spark] itholic commented on a diff in pull request #40507: [SPARK-42662][CONNECT][PS] Add `_distributed_sequence_id` for distributed-sequence index.

2023-03-21 Thread via GitHub


itholic commented on code in PR #40507:
URL: https://github.com/apache/spark/pull/40507#discussion_r1143198700


##
python/pyspark/sql/connect/functions.py:
##
@@ -2471,6 +2472,13 @@ def udf(
 udf.__doc__ = pysparkfuncs.udf.__doc__
 
 
+def _distributed_sequence_id() -> Column:

Review Comment:
   Oh, it is supposed to be used to create the default index of the pandas API 
on Spark in the follow-up PR.
   
   Since the subsequent PR will include many code fixes and tests, I 
separated the current work for review convenience.


