Re: [PR] [SPARK-47927][SQL]: Fix nullability attribute in UDF decoder [spark]

2024-04-27 Thread via GitHub


cloud-fan closed pull request #46156: [SPARK-47927][SQL]: Fix nullability 
attribute in UDF decoder
URL: https://github.com/apache/spark/pull/46156


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-47927][SQL]: Fix nullability attribute in UDF decoder [spark]

2024-04-27 Thread via GitHub


cloud-fan commented on PR #46156:
URL: https://github.com/apache/spark/pull/46156#issuecomment-2081341603

   thanks, merging to master/3.5/3.4!





Re: [PR] [SPARK-47927][SQL]: Fix nullability attribute in UDF decoder [spark]

2024-04-27 Thread via GitHub


cloud-fan commented on PR #46156:
URL: https://github.com/apache/spark/pull/46156#issuecomment-2081341418

   good catch!





Re: [PR] [SPARK-48004][SQL] Add WriteFilesExecBase trait for v1 write [spark]

2024-04-27 Thread via GitHub


cloud-fan commented on PR #46240:
URL: https://github.com/apache/spark/pull/46240#issuecomment-2081340902

   late LGTM





Re: [PR] [SPARK-48002][PYTHON][SS] Add test for observed metrics in PySpark StreamingQueryListener [spark]

2024-04-27 Thread via GitHub


WweiL commented on PR #46237:
URL: https://github.com/apache/spark/pull/46237#issuecomment-2081327432

   @HyukjinKwon  I think we can merge this now : ) 





Re: [PR] [SPARK-47292][SS] safeMapToJValue should consider null typed values [spark]

2024-04-27 Thread via GitHub


WweiL commented on PR #46260:
URL: https://github.com/apache/spark/pull/46260#issuecomment-2081317940

   CC @HeartSaVioR PTAL, thank you!





[PR] [SPARK-47292][SS] safeMapToJValue should consider null typed values [spark]

2024-04-27 Thread via GitHub


WweiL opened a new pull request, #46260:
URL: https://github.com/apache/spark/pull/46260

   
   
   ### What changes were proposed in this pull request?
   
   Add a null check to `safeMapToJValue`. Normally we won't create a 
`StreamingQueryProgress` whose map fields are null. It is also very unlikely in 
Spark Connect, but it is theoretically possible because we send the JSON 
directly in Spark Connect, so this check is added as extra safety to avoid 
crashing the server.
   
   ### Why are the changes needed?
   
   Minor bug fix
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added unit test
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
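
A minimal sketch of the kind of null guard described in this PR, written in Python rather than Scala for brevity. It is not Spark's `safeMapToJValue`: the names `safe_map_to_json` and `progress_to_json`, and the choice to omit a field whose map is missing or empty, are assumptions made for illustration only.

```python
import json
from typing import Any, Callable, Mapping, Optional


def safe_map_to_json(
    mapping: Optional[Mapping[str, Any]],
    value_to_json: Callable[[Any], Any] = lambda v: v,
) -> Optional[dict]:
    """Convert a possibly-missing map to a JSON-ready dict.

    Returns None (meaning "omit this field") when the map is None or empty,
    instead of raising on a None input.
    """
    if not mapping:  # covers both None and an empty map
        return None
    # Sort keys so the serialized output is deterministic.
    return {key: value_to_json(mapping[key]) for key in sorted(mapping)}


def progress_to_json(progress: Mapping[str, Any]) -> str:
    """Serialize a progress-like record, skipping map fields that are absent."""
    out = {"id": progress.get("id")}
    # Hypothetical subset of map-typed fields, chosen for the example.
    for field in ("durationMs", "eventTime"):
        converted = safe_map_to_json(progress.get(field))
        if converted is not None:
            out[field] = converted
    return json.dumps(out)
```

With this guard, a record that arrives as raw JSON with `"durationMs": null` serializes without the field rather than failing on the null map.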





Re: [PR] [SPARK-48021][ML][BUILD][FOLLOWUP] add `--add-modules=jdk.incubator.vector` to maven compile args [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46259:
URL: https://github.com/apache/spark/pull/46259#issuecomment-2081311526

   > We can manually verify it through Maven test `build/mvn test -pl 
mllib-local`:
   > 
   > Before
   > 
   > 
![image](https://github.com/apache/spark/assets/1475305/1c002f85-175e-4554-a5a5-b05eab244f9c)
   > 
   > there is a WARNING message: `WARNING: Failed to load implementation 
from:dev.ludovic.netlib.blas.VectorBLAS`
   > 
   > After
   > 
   > 
![image](https://github.com/apache/spark/assets/1475305/a83b89c0-944d-45ce-9b96-572448d5d97e)
   > 
   > no WARNING message related to `VectorBLAS`
   
   Yeah





Re: [PR] [SPARK-48021][ML][BUILD][FOLLOWUP] add `--add-modules=jdk.incubator.vector` to maven compile args [spark]

2024-04-27 Thread via GitHub


LuciferYang commented on PR #46259:
URL: https://github.com/apache/spark/pull/46259#issuecomment-2081310548

   We can manually verify it through Maven test:
   
   Before
   
   
![image](https://github.com/apache/spark/assets/1475305/1c002f85-175e-4554-a5a5-b05eab244f9c)
   
   there is a WARNING message: `WARNING: Failed to load implementation 
from:dev.ludovic.netlib.blas.VectorBLAS`
   
   After
   
   
   
![image](https://github.com/apache/spark/assets/1475305/a83b89c0-944d-45ce-9b96-572448d5d97e)
   
   no WARNING message related to `VectorBLAS`
   





Re: [PR] [SPARK-48019] Fix incorrect behavior in ColumnVector/ColumnarArray with dictionary and nulls [spark]

2024-04-27 Thread via GitHub


cloud-fan closed pull request #46254: [SPARK-48019] Fix incorrect behavior in 
ColumnVector/ColumnarArray with dictionary and nulls
URL: https://github.com/apache/spark/pull/46254





Re: [PR] [SPARK-48019] Fix incorrect behavior in ColumnVector/ColumnarArray with dictionary and nulls [spark]

2024-04-27 Thread via GitHub


cloud-fan commented on PR #46254:
URL: https://github.com/apache/spark/pull/46254#issuecomment-2081305430

   thanks, merging to master/3.5!





Re: [PR] [SPARK-48021][ML][BUILD][FOLLOWUP] add `--add-modules=jdk.incubator.vector` to maven compile args [spark]

2024-04-27 Thread via GitHub


LuciferYang commented on PR #46259:
URL: https://github.com/apache/spark/pull/46259#issuecomment-2081303527

   Yes, we should keep `JavaModuleOptions`, `extraTestJavaArgs` in 
`SparkBuild.scala`, and `extraJavaTestArgs` in `pom.xml` consistent.
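
One way to guard against the three places drifting apart is a small text check. The sketch below is only an illustration under stated assumptions: `missing_flag` is a hypothetical helper, and it does a plain substring search, so it cannot tell an active option from one mentioned in a comment.

```python
from typing import Dict, List

# The option discussed in this thread.
FLAG = "--add-modules=jdk.incubator.vector"


def missing_flag(sources: Dict[str, str], flag: str = FLAG) -> List[str]:
    """Return the names of the sources whose text does not mention `flag`.

    `sources` maps a label (for example a file name) to that file's contents.
    """
    return sorted(name for name, text in sources.items() if flag not in text)
```

In a checkout, `sources` could be built by reading `pom.xml`, `SparkBuild.scala`, and the file that defines `JavaModuleOptions`; an empty result would mean all three mention the flag.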





Re: [PR] [SPARK-48021][ML][BUILD][FOLLOWUP] add `--add-modules=jdk.incubator.vector` to maven compile args [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46259:
URL: https://github.com/apache/spark/pull/46259#issuecomment-2081303045

   cc @LuciferYang 





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2081300766

   > @panbingkun we should add `--add-modules=jdk.incubator.vector` to 
`extraJavaTestArgs` in `pom.xml` too
   > 
   > 
https://github.com/apache/spark/blob/64d321926bbcede05d1c145405d503b3431f185b/pom.xml#L305-L323
   
   Okay, let me do it.





Re: [PR] [SPARK-48011][Core] Store LogKey name as a value to avoid generating new string instances [spark]

2024-04-27 Thread via GitHub


LuciferYang commented on PR #46249:
URL: https://github.com/apache/spark/pull/46249#issuecomment-2081298871

   late LGTM





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


LuciferYang commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2081298219

   @panbingkun we should add `--add-modules=jdk.incubator.vector` to `pom.xml` 
too
   
   
   
https://github.com/apache/spark/blob/64d321926bbcede05d1c145405d503b3431f185b/pom.xml#L305-L323





[PR] [Only for check Docker Image] Check installed packages on ubuntu 22.04 [spark]

2024-04-27 Thread via GitHub


panbingkun opened a new pull request, #46258:
URL: https://github.com/apache/spark/pull/46258

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   





Re: [PR] [SPARK-47516][INFRA] Move `remove unused installation package logic` from `each test job` to `create the docker image` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #45659:
URL: https://github.com/apache/spark/pull/45659#issuecomment-2081287287

   > @panbingkun
   > 
   > Hi bingkun, when rebuilding the image in 
https://github.com/zhengruifeng/spark/actions/runs/8857365994/job/24324764602
   > 
   > I see these warnings:
   > 
   > ```
   > #35 [29/31] RUN apt-get remove --purge -y '^aspnet.*' '^dotnet-.*' 
'^llvm-.*' 'php.*' '^mongodb-.*' snapd google-chrome-stable 
microsoft-edge-stable firefox azure-cli google-cloud-sdk mono-devel 
powershell libgl1-mesa-dri || true
   > #35 0.489 Reading package lists...
   > #35 0.505 Building dependency tree...
   > #35 0.507 Reading state information...
   > #35 0.511 E: Unable to locate package ^aspnet.*
   > #35 0.511 E: Couldn't find any package by glob '^aspnet.*'
   > #35 0.511 E: Couldn't find any package by regex '^aspnet.*'
   > #35 0.511 E: Unable to locate package ^dotnet-.*
   > #35 0.511 E: Couldn't find any package by glob '^dotnet-.*'
   > #35 0.511 E: Couldn't find any package by regex '^dotnet-.*'
   > #35 0.511 E: Unable to locate package ^llvm-.*
   > #35 0.511 E: Couldn't find any package by glob '^llvm-.*'
   > #35 0.511 E: Couldn't find any package by regex '^llvm-.*'
   > #35 0.511 E: Unable to locate package ^mongodb-.*
   > #35 0.511 E: Couldn't find any package by glob '^mongodb-.*'
   > #35 0.511 E: Couldn't find any package by regex '^mongodb-.*'
   > #35 0.511 Package 'php-crypt-gpg' is not installed, so not removed
   > #35 0.511 Package 'php' is not installed, so not removed
   > #35 0.511 E: Unable to locate package snapd
   > #35 0.511 E: Unable to locate package google-chrome-stable
   > #35 0.511 E: Unable to locate package microsoft-edge-stable
   > #35 0.511 E: Unable to locate package firefox
   > #35 0.511 E: Unable to locate package azure-cli
   > #35 0.511 E: Unable to locate package google-cloud-sdk
   > #35 0.511 E: Unable to locate package mono-devel
   > #35 0.511 E: Unable to locate package powershell
   > #35 DONE 0.5s
   > 
   > #36 [30/31] RUN apt-get autoremove --purge -y
   > #36 0.063 Reading package lists...
   > #36 0.079 Building dependency tree...
   > #36 0.082 Reading state information...
   > #36 0.088 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
   > #36 DONE 0.4s
   > ```
   > 
   > would you mind helping check whether this removal is needed in ubuntu 2024
   
   Sure, let me do it.
   Thanks!
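
As a starting point for that check, the patterns that `apt-get` could not locate can be pulled out of a build log like the one quoted above. This is a rough sketch, not part of the PR: `unlocated_packages` is a hypothetical helper, and it relies only on the `E: Unable to locate package ...` lines seen in that log.

```python
import re
from typing import List

# Matches lines such as: "#35 0.511 E: Unable to locate package snapd"
_UNLOCATED = re.compile(r"E: Unable to locate package (\S+)")


def unlocated_packages(log_text: str) -> List[str]:
    """Return package names or patterns apt-get could not locate, in log order."""
    seen: List[str] = []
    for match in _UNLOCATED.finditer(log_text):
        name = match.group(1)
        if name not in seen:  # keep the first occurrence only
            seen.append(name)
    return seen
```

Anything this returns is a candidate for dropping from the `apt-get remove` list, subject to confirming against the actual base image.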





[PR] [SPARK-48024][PYTHON][CONNECT][TESTS] Enable `UDFParityTests.test_udf_timestamp_ntz` [spark]

2024-04-27 Thread via GitHub


zhengruifeng opened a new pull request, #46257:
URL: https://github.com/apache/spark/pull/46257

   ### What changes were proposed in this pull request?
   Enable `UDFParityTests.test_udf_timestamp_ntz`
   
   
   ### Why are the changes needed?
   for test coverage
   
   
   ### Does this PR introduce _any_ user-facing change?
   no, test only
   
   
   ### How was this patch tested?
   CI and manual test:
   ```
   (spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k 
--python-executables python3 --testnames 
'pyspark.sql.tests.connect.test_parity_udf 
UDFParityTests.test_udf_timestamp_ntz'
   Running PySpark tests. Output is in 
/Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
   Will test against the following Python executables: ['python3']
   Will test the following Python tests: 
['pyspark.sql.tests.connect.test_parity_udf 
UDFParityTests.test_udf_timestamp_ntz']
   python3 python_implementation is CPython
   python3 version is: Python 3.12.2
   Starting test(python3): pyspark.sql.tests.connect.test_parity_udf 
UDFParityTests.test_udf_timestamp_ntz (temp output: 
/Users/ruifeng.zheng/Dev/spark/python/target/90afedde-8472-496c-8741-a3fd5792f6e2/python3__pyspark.sql.tests.connect.test_parity_udf_UDFParityTests.test_udf_timestamp_ntz__7yrowv9l.log)
   Finished test(python3): pyspark.sql.tests.connect.test_parity_udf 
UDFParityTests.test_udf_timestamp_ntz (10s)
   Tests passed in 10 seconds
   ```
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


dongjoon-hyun commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2081266374

   Merged to master. Thank you, @panbingkun and all!





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


dongjoon-hyun closed pull request #46246: [SPARK-48021][ML][BUILD] Add 
`--add-modules=jdk.incubator.vector` to `JavaModuleOptions`
URL: https://github.com/apache/spark/pull/46246





Re: [PR] [SPARK-48020][INFRA][PYTHON] Pin 'pandas==2.2.2' [spark]

2024-04-27 Thread via GitHub


dongjoon-hyun commented on PR #46256:
URL: https://github.com/apache/spark/pull/46256#issuecomment-2081265499

   Thank you all!





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


zhengruifeng commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2081262259

   also cc @WeichenXu123 





Re: [PR] [SPARK-46744][SPARK-SHELL][SQL][CONNECT][PYTHON][R] Display clear `exit command` for all spark terminal [spark]

2024-04-27 Thread via GitHub


github-actions[bot] commented on PR #44769:
URL: https://github.com/apache/spark/pull/44769#issuecomment-2081261642

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





Re: [PR] [SPARK-44635][CORE] Handle shuffle fetch failures in decommissions [spark]

2024-04-27 Thread via GitHub


github-actions[bot] commented on PR #42296:
URL: https://github.com/apache/spark/pull/42296#issuecomment-2081261651

   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





Re: [PR] [SPARK-48020][INFRA][PYTHON] Pin 'pandas==2.2.2' [spark]

2024-04-27 Thread via GitHub


zhengruifeng commented on PR #46256:
URL: https://github.com/apache/spark/pull/46256#issuecomment-2081261491

   thank you @yaooqinn and @HyukjinKwon 





Re: [PR] [Don't review, only for test][SPARK-48022][BUILD] Upgrade `jersey` to `3.1.6` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46252:
URL: https://github.com/apache/spark/pull/46252#issuecomment-2081261447

   > The MR below may also give some hints for this ticket: bumping Jersey to 
v3.1.x requires all of Spark to comply with EE10 standards, as I found while 
attempting the Jetty 12 upgrade. #45500
   > 
   > In this particular case, it seems Jetty and Jersey are a bundle deal.
   
   Thank you for the hint.





Re: [PR] [Don't review, only for test][SPARK-48022][BUILD] Upgrade `jersey` to `3.1.6` [spark]

2024-04-27 Thread via GitHub


HiuKwok commented on PR #46252:
URL: https://github.com/apache/spark/pull/46252#issuecomment-2081188223

   The MR below may also give some hints for this ticket: bumping Jersey to 
v3.1.x requires all of Spark to comply with EE10 standards, as I found while 
attempting the Jetty 12 upgrade.
   https://github.com/apache/spark/pull/45500





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2080865185

   > Thank you for looking into that! Let me know what I should do to update 
dev.ludovic.netlib further for the needs of Spark
   
   Thanks to everyone for writing in such detail during the previous PR 
process. Because of this, I could easily analyze and trace the history. ❤️





Re: [PR] [SPARK-48020][INFRA][PYTHON] Pin 'pandas==2.2.2' [spark]

2024-04-27 Thread via GitHub


yaooqinn closed pull request #46256: [SPARK-48020][INFRA][PYTHON] Pin 
'pandas==2.2.2'
URL: https://github.com/apache/spark/pull/46256





Re: [PR] [SPARK-48020][INFRA][PYTHON] Pin 'pandas==2.2.2' [spark]

2024-04-27 Thread via GitHub


yaooqinn commented on PR #46256:
URL: https://github.com/apache/spark/pull/46256#issuecomment-2080844028

   Thank you @zhengruifeng @HyukjinKwon 
   
   Merged to master. 





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


luhenry commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2080831209

   Thank you for looking into that! Let me know what I should do to update 
dev.ludovic.netlib further for the needs of Spark 





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2080813720

   > Before this flag was gated on Java 21 - it's OK to set this on earlier 
versions? OK if so
   
   Yes, the JDK version of the above manual test environment (local) is `17`.
   Screenshot: https://github.com/apache/spark/assets/15246973/dba0297a-e51e-49a2-bdf6-f4268cb51c34
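
   One way to confirm on a given JDK whether the flag took effect is to ask the boot module layer directly. The sketch below is illustrative only and is not part of the PR; `VectorModuleCheck` and `isModuleResolved` are hypothetical names. It relies on incubator modules not being resolved by default, so the module should only show up when the JVM was started with `--add-modules=jdk.incubator.vector`.

```scala
object VectorModuleCheck {
  // True if the named module is part of the boot layer of the running JVM.
  def isModuleResolved(name: String): Boolean =
    ModuleLayer.boot().findModule(name).isPresent

  def main(args: Array[String]): Unit = {
    val vector = "jdk.incubator.vector"
    // Runtime.version().feature() is the major Java version (e.g. 17 or 21).
    println(s"Java ${Runtime.version().feature()}: $vector resolved = ${isModuleResolved(vector)}")
  }
}
```

   Running it once with and once without the flag on the same JDK shows whether the option is accepted on that version.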
   





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


srowen commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2080797254

   Before this flag was gated on Java 21 - it's OK to set this on earlier 
versions? OK if so





Re: [PR] [SPARK-48021][ML][BUILD] Add `--add-modules=jdk.incubator.vector` to `JavaModuleOptions` [spark]

2024-04-27 Thread via GitHub


panbingkun commented on PR #46246:
URL: https://github.com/apache/spark/pull/46246#issuecomment-2080762133

   cc @luhenry @srowen @zhengruifeng @dongjoon-hyun @LuciferYang 





Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-27 Thread via GitHub


jshmchenxi commented on code in PR #46149:
URL: https://github.com/apache/spark/pull/46149#discussion_r1581765866


##
resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala:
##
@@ -35,7 +35,9 @@ import org.apache.spark.util.Utils
 
 class BasicDriverFeatureStepSuite extends SparkFunSuite {
 
-  private val CUSTOM_DRIVER_LABELS = Map("labelkey" -> "labelvalue")
+  private val CUSTOM_DRIVER_LABELS = Map(
+    "labelkey" -> "labelvalue",
+    "yunikorn.apache.org/app-id" -> "{{APPID}}")

Review Comment:
   Understood. Actually the label key used here can be any string. I've updated 
it to use a general label key as well as a general annotation key in this test. 
Also fixed the typo `APPID` -> `APP_ID`.
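
   To illustrate why the `APPID` typo matters, here is a minimal sketch of placeholder substitution in label values. It assumes the placeholders are the literal tokens `{{APP_ID}}` and `{{EXECUTOR_ID}}`, replaced by plain string substitution; `LabelPlaceholders`, `substitutePlaceholders` and `resolveLabels` are hypothetical names, not Spark's actual helpers.

```scala
object LabelPlaceholders {
  // Hypothetical helper: replaces the literal tokens {{APP_ID}} and {{EXECUTOR_ID}}.
  // Any other text, including a misspelled token such as {{APPID}}, is left untouched.
  def substitutePlaceholders(value: String, appId: String, execId: String): String =
    value.replace("{{APP_ID}}", appId).replace("{{EXECUTOR_ID}}", execId)

  // Applies the substitution to every label value; label keys are not modified.
  def resolveLabels(
      labels: Map[String, String],
      appId: String,
      execId: String): Map[String, String] =
    labels.map { case (key, value) => key -> substitutePlaceholders(value, appId, execId) }
}
```

   With this sketch, a value of `{{APP_ID}}` resolves to the application id, while `{{APPID}}` passes through unchanged, so a test using the misspelled token would not exercise the substitution.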






Re: [PR] [SPARK-47730][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholders in labels [spark]

2024-04-27 Thread via GitHub


jshmchenxi commented on PR #46149:
URL: https://github.com/apache/spark/pull/46149#issuecomment-2080412508

   It's been a busy week, sorry for the delay. I'll address your comments 
today, thanks! @dongjoon-hyun 





[PR] [SPARK-48020][INFRA][PYTHON] Pin 'pandas==2.2.2' [spark]

2024-04-27 Thread via GitHub


zhengruifeng opened a new pull request, #46256:
URL: https://github.com/apache/spark/pull/46256

   ### What changes were proposed in this pull request?
   1. Pin 'pandas==2.2.2' for `pypy3.9`.
   2. Also change `pandas<=2.2.2` to 'pandas==2.2.2' to avoid installing an 
unexpected version (e.g. for pypy3.8, `pandas<=2.2.2` actually installs 
version 2.0.3).
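
   The difference between the two specifiers can be modelled with a small sketch. It assumes the installer picks the highest release that satisfies both the version specifier and the release's minimum Python version; the release table below is illustrative data written for this example (not read from PyPI), and `PinSketch`, `resolveAtMost` and `resolveExact` are hypothetical names.

```scala
object PinSketch {
  type Version = (Int, Int, Int)

  // Illustrative metadata (an assumption): release version and the minimum
  // Python (major, minor) that release declares support for.
  final case class Release(version: Version, minPython: (Int, Int))

  val releases: Seq[Release] = Seq(
    Release((2, 0, 3), (3, 8)),
    Release((2, 1, 4), (3, 9)),
    Release((2, 2, 2), (3, 9)))

  // Tuple orderings compare component by component, left to right.
  private val versionOrd = Ordering[Version]
  private val pythonOrd = Ordering[(Int, Int)]

  private def supports(r: Release, python: (Int, Int)): Boolean =
    pythonOrd.lteq(r.minPython, python)

  // Models `pandas<=upper`: the highest compatible release not above `upper`.
  def resolveAtMost(upper: Version, python: (Int, Int)): Option[Version] =
    releases
      .filter(r => versionOrd.lteq(r.version, upper) && supports(r, python))
      .map(_.version)
      .sorted(versionOrd)
      .lastOption

  // Models `pandas==exact`: that exact release, or nothing at all.
  def resolveExact(exact: Version, python: (Int, Int)): Option[Version] =
    releases.find(r => r.version == exact && supports(r, python)).map(_.version)
}
```

   Under these assumptions the upper-bound form silently falls back to an older release on an older interpreter, whereas the exact pin fails to resolve, which makes the mismatch visible.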
   
   ### Why are the changes needed?
   PyPy has been upgraded.
   
   
   ### Does this PR introduce _any_ user-facing change?
   no, test only
   
   
   ### How was this patch tested?
   ci
   
   ### Was this patch authored or co-authored using generative AI tooling?
   no





Re: [PR] [SPARK-47516][INFRA] Move `remove unused installation package logic` from `each test job` to `create the docker image` [spark]

2024-04-27 Thread via GitHub


zhengruifeng commented on PR #45659:
URL: https://github.com/apache/spark/pull/45659#issuecomment-2080386388

   @panbingkun 
   
   Hi bingkun, when rebuilding the image in 
https://github.com/zhengruifeng/spark/actions/runs/8857365994/job/24324764602
   
   I see these warnings:
   ```
   #35 [29/31] RUN apt-get remove --purge -y '^aspnet.*' '^dotnet-.*' 
'^llvm-.*' 'php.*' '^mongodb-.*' snapd google-chrome-stable 
microsoft-edge-stable firefox azure-cli google-cloud-sdk mono-devel 
powershell libgl1-mesa-dri || true
   #35 0.489 Reading package lists...
   #35 0.505 Building dependency tree...
   #35 0.507 Reading state information...
   #35 0.511 E: Unable to locate package ^aspnet.*
   #35 0.511 E: Couldn't find any package by glob '^aspnet.*'
   #35 0.511 E: Couldn't find any package by regex '^aspnet.*'
   #35 0.511 E: Unable to locate package ^dotnet-.*
   #35 0.511 E: Couldn't find any package by glob '^dotnet-.*'
   #35 0.511 E: Couldn't find any package by regex '^dotnet-.*'
   #35 0.511 E: Unable to locate package ^llvm-.*
   #35 0.511 E: Couldn't find any package by glob '^llvm-.*'
   #35 0.511 E: Couldn't find any package by regex '^llvm-.*'
   #35 0.511 E: Unable to locate package ^mongodb-.*
   #35 0.511 E: Couldn't find any package by glob '^mongodb-.*'
   #35 0.511 EPackage 'php-crypt-gpg' is not installed, so not removed
   #35 0.511 Package 'php' is not installed, so not removed
   #35 0.511 : Couldn't find any package by regex '^mongodb-.*'
   #35 0.511 E: Unable to locate package snapd
   #35 0.511 E: Unable to locate package google-chrome-stable
   #35 0.511 E: Unable to locate package microsoft-edge-stable
   #35 0.511 E: Unable to locate package firefox
   #35 0.511 E: Unable to locate package azure-cli
   #35 0.511 E: Unable to locate package google-cloud-sdk
   #35 0.511 E: Unable to locate package mono-devel
   #35 0.511 E: Unable to locate package powershell
   #35 DONE 0.5s
   
   #36 [30/31] RUN apt-get autoremove --purge -y
   #36 0.063 Reading package lists...
   #36 0.079 Building dependency tree...
   #36 0.082 Reading state information...
   #36 0.088 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
   #36 DONE 0.4s
   ```
   
   Would you mind helping check whether this removal is needed in Ubuntu 2024?





Re: [PR] [SPARK-48012][SQL] SPJ: Support Transfrom Expressions for One Side Shuffle [spark]

2024-04-27 Thread via GitHub


szehon-ho commented on PR #46255:
URL: https://github.com/apache/spark/pull/46255#issuecomment-2080381160

   Some implementation notes.  SPARK-41471 works by giving the 
ShuffleExchangeExec side of the join a KeyGroupedPartitioning, which is 
created by the other side's KeyGroupedShuffleSpec and is a clone of it (with 
the other side's partition expressions and values).  That way both sides of the 
join have a KeyGroupedPartitioning and SPJ can work.
   
   Code changes:
   - Remove check in KeyGroupedShuffleSpec::canCreatePartitioning that allows 
only AttributeReference, and add support for TransformExpression 
   - Implement TransformExpression.eval(), by re-using the code from 
V2ExpressionUtils.  This allows the ShuffleExchangeExec to evaluate the 
partition key with transform expressions from each row.
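
   As a rough illustration of what evaluating the transform per row buys the shuffle side, the sketch below maps each row to a partition id by first applying a years transform and then looking the result up among the partition values reported by the other side. Everything here is a simplified stand-in: `OneSideShuffleSketch`, `years`, `buildValueMap` and `partitionFor` are hypothetical names, and "whole years since 1970" is an assumed definition of the transform.

```scala
import java.time.LocalDate

object OneSideShuffleSketch {
  // Stand-in for a years transform: whole years since 1970 (assumed semantics).
  def years(date: LocalDate): Int = date.getYear - 1970

  // Partition values reported by the already-partitioned side, in partition order.
  // Keys are Seq[Any] to mirror multi-column partition keys.
  def buildValueMap(partitionValues: Seq[Seq[Any]]): Map[Seq[Any], Int] =
    partitionValues.zipWithIndex.toMap

  // For each shuffled row: evaluate the transform, then look up its partition id.
  // Returns None when the other side has no partition for that value.
  def partitionFor(valueMap: Map[Seq[Any], Int], rowDate: LocalDate): Option[Int] =
    valueMap.get(Seq[Any](years(rowDate)))
}
```

   The sketch returns `None` for unseen values only to keep the example small; how the real partitioner treats such keys is not shown here.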
   
   Some fixes:
   - Normalize the valueMap key type in KeyGroupedPartitioner to use a specific 
Seq implementation class.  Previously the partitioner's map was initialized 
with keys as Vector, but then compared with keys as ArraySeq; these seem 
to have different hashcodes, so it would always create new entries with new 
partition ids.
   - Add support in V2ExpressionUtils for Scala 'static' invoke() methods for 
ScalarFunctions (currently only Java static invoke() methods are supported).  
This was needed, for example, in our Scala test YearsTransform.
   - Change the test YearsTransform to have the same logic as the 
InMemoryBaseTable.  This was pointed out in the 
[SPARK-41471](https://github.com/apache/spark/pull/42194) PR.
   
   Limitations:
   - This feature is disabled if partiallyClustered is enabled.  Partially 
clustered implies the partitioned side of the join has multiple partitions with 
the same value, and does not group them.  It is not yet clear how the 
partitioner on the shuffle side can handle that.
   - This feature is disabled if allowJoinKeysLessThanPartitionKeys is enabled 
and partitions are transform expressions.  The allowJoinKeysLessThanPartitionKeys 
feature works by 'grouping' the BatchScanExec's partitions again by join keys.  
If it is enabled along with this feature, a failure happens when checking 
that both sides of the join (ShuffleExchangeExec and the partitioned 
BatchScanExec side) have the same number of partitions.  This actually works in 
the first optimizer pass, as ShuffleExchangeExec's KeyGroupedPartitioning is 
created as a clone of the other side (including partition values).  But after 
that there is a 'grouping' phase triggered here:
   
   ```
   // Now we need to push-down the common partition information to the scan in each child
   newLeft = populateCommonPartitionInfo(left, mergedPartValues, leftSpec.joinKeyPositions,
     leftReducers, applyPartialClustering, replicateLeftSide)
   newRight = populateCommonPartitionInfo(right, mergedPartValues, rightSpec.joinKeyPositions,
     rightReducers, applyPartialClustering, replicateRightSide)
   ```
   This updates the number of partitions on the BatchScanExec after the 
grouping by join key.  But it does not update the ShuffleExchangeExec's number of 
partitions.  Hence the error in the subsequent optimizer pass:
   ```
   requirement failed: PartitioningCollection requires all of its partitionings have the same numPartitions.
   java.lang.IllegalArgumentException: requirement failed: PartitioningCollection requires all of its partitionings have the same numPartitions.
       at scala.Predef$.require(Predef.scala:337)
       at org.apache.spark.sql.catalyst.plans.physical.PartitioningCollection.<init>(partitioning.scala:550)
       at org.apache.spark.sql.execution.joins.ShuffledJoin.outputPartitioning(ShuffledJoin.scala:49)
       at org.apache.spark.sql.execution.joins.ShuffledJoin.outputPartitioning$(ShuffledJoin.scala:47)
       at org.apache.spark.sql.execution.joins.SortMergeJoinExec.outputPartitioning(SortMergeJoinExec.scala:39)
       at org.apache.spark.sql.execution.exchange.EnsureRequirements.$anonfun$ensureDistributionAndOrdering$1(EnsureRequirements.scala:66)
       at scala.collection.immutable.Vector1.map(Vector.scala:2140)
       at scala.collection.immutable.Vector1.map(Vector.scala:385)
       at org.apache.spark.sql.execution.exchange.EnsureRequirements.org$apache$spark$sql$execution$exchange$EnsureRequirements$$ensureDistributionAndOrdering(EnsureRequirements.scala:65)
       at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$1.applyOrElse(EnsureRequirements.scala:657)
       at org.apache.spark.sql.execution.exchange.EnsureRequirements$$anonfun$1.applyOrElse(EnsureRequirements.scala:632)
   ```
   This can be reproduced by removing this check and running the relevant unit 
test added in this PR.  It needs more investigation before it can be enabled in a 
follow-up PR.
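
   The failing invariant can be shown in isolation with a simplified model. The types below are stand-ins written for this note, not the real Spark classes (which carry far more state), and the partition counts 4 and 2 are made-up numbers; only the `require` condition and its message follow the error text above.

```scala
import scala.util.Try

object NumPartitionsSketch {
  // Simplified stand-in for a physical partitioning with a partition count.
  final case class Partitioning(name: String, numPartitions: Int)

  // Stand-in for the collection whose constructor enforces equal partition counts.
  final case class PartitioningCollection(partitionings: Seq[Partitioning]) {
    require(
      partitionings.map(_.numPartitions).distinct.length == 1,
      "PartitioningCollection requires all of its partitionings have the same numPartitions.")
  }

  // First pass: the shuffle side is a clone of the scan side, so the counts agree.
  val firstPass: Try[PartitioningCollection] = Try(PartitioningCollection(Seq(
    Partitioning("scan", 4), Partitioning("shuffle", 4))))

  // Later pass: only the scan side was regrouped by join keys (4 -> 2 is illustrative),
  // so constructing the collection fails with IllegalArgumentException.
  val laterPass: Try[PartitioningCollection] = Try(PartitioningCollection(Seq(
    Partitioning("scan", 2), Partitioning("shuffle", 4))))
}
```

   In this model `firstPass` succeeds and `laterPass` fails, matching the description that the first optimizer pass works and the subsequent one does not.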



[PR] [SPARK-48012][SQL] SPJ: Support Transfrom Expressions for One Side Shuffle [spark]

2024-04-27 Thread via GitHub


szehon-ho opened a new pull request, #46255:
URL: https://github.com/apache/spark/pull/46255

   
   ### Why are the changes needed?
   
   Support SPJ one-side shuffle if other side has partition transform expression
   
   ### How was this patch tested?
   New unit test in KeyGroupedPartitioningSuite
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.

