Re: [PR] [SPARK-48030][SQL] SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-30 Thread via GitHub
advancedxy commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2084474689 > LGTM. It would be nice to have something to configure this but I don't think it is super important. I feel the default value should be more than enough for most use cases? Similarly

Re: [PR] [SPARK-48030][SQL] SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
sunchao commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2084353517 Thanks! merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-48030][SQL] SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
sunchao closed pull request #46265: [SPARK-48030][SQL] SPJ: cache rowOrdering and structType for InternalRowComparableWrapper URL: https://github.com/apache/spark/pull/46265 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
szehon-ho commented on code in PR #46265: URL: https://github.com/apache/spark/pull/46265#discussion_r1583801503 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/InternalRowComparableWrapper.scala: ## @@ -53,6 +55,21 @@ class InternalRowComparableWrapper(val

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
advancedxy commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2082705978 > It appears that this PR can enhance performance when there are a large number of partitions. Could you please share the test results from a real DatasourceV2 table, such as Iceberg?

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
yabola commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2082672856 @advancedxy It appears that this PR can enhance performance when there are a large number of partitions. Could you please share the test results from a real DatasourceV2 table?

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-29 Thread via GitHub
advancedxy commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2082648145 > In addition, is there a better expireAfterAccess configuration for NonFateSharingCache? hmmm, maybe. However, the memory usage of cache should be relatively low. Let's wait

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-28 Thread via GitHub
advancedxy commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2081471602 @sunchao, @szehon-ho and @yabola would you mind to take a look at this and help review this one. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-28 Thread via GitHub
advancedxy commented on code in PR #46265: URL: https://github.com/apache/spark/pull/46265#discussion_r1582148061 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/InternalRowComparableWrapper.scala: ## @@ -53,6 +55,21 @@ class InternalRowComparableWrapper(val

Re: [PR] [SPARK-48030][SQL]SPJ: cache rowOrdering and structType for InternalRowComparableWrapper [spark]

2024-04-28 Thread via GitHub
advancedxy commented on PR #46265: URL: https://github.com/apache/spark/pull/46265#issuecomment-2081471221 Before applying changes in this PR, the benchmark code(in this PR) took: ``` [info] OpenJDK 64-Bit Server VM 21.0.2 on Mac OS X 14.4.1 [info] Apple M1 [info] internal row