Re: [PR] Change bigquery to compileonly [iceberg]
rdblue merged PR #15655: URL: https://github.com/apache/iceberg/pull/15655 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rdblue commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4173119481 Thanks, @rambleraptor! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rdblue commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4173116907 Thanks for posting the dependency validation. I'll follow up with a PR to fix Aliyun. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4172258164 https://gist.github.com/rambleraptor/c16285da277253d71340c70737191435 I took the Spark JAR and got three lists of dependencies: - no BQ (deps-no-bq) - HEAD (deps-pre-15655) - my PR (deps-post-15655) Revision 2 diffs no BQ against my PR, which results in no changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4171764155 Here's a [Gist](https://gist.github.com/rambleraptor/6a39f7e2058521aea1ef12a952daf7ee) showing the iceberg-bigquery JAR before/after this change along with the diff. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4171800172 Here's a [Gist](https://gist.github.com/rambleraptor/6a39f7e2058521aea1ef12a952daf7ee) showing the iceberg-bigquery JAR before/after this change along with the diff. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4171584730 @RussellSpitzer that's nifty! Fun use of cursor. I dumped the list of dependencies before / after this PR into [this doc](https://docs.google.com/document/d/1ieSOpPJKvtDCNBnRerd3XODN9ibVHCG4k-5eKEFBLW8/edit?tab=t.0). The list of dependencies after this PR goes down dramatically. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
RussellSpitzer commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4171571142 Asked Cursor to check out the state of the runtime jars at three different time points before we did the bigquery change, before this pr, and after this pr -- ## Three-Way JAR Comparison (Spark 4.1 Runtime) | | Pre-BigQuery | Pre-#15655 (current) | Post-#15655 (fixed) | |---|---|---|---| | **Size** | 50 MB | **71 MB** | 50 MB | | **Files** | 29,011 | **37,015** | 29,019 | | **Delta vs Pre-BigQuery** | baseline | +21 MB / +8,004 files | +8 files | | **MD5** | `ed7fe76911385b2bb71d0786bae6d057` | `1ab287ee7c480f176b8304fda8d2e3d6` | `8a886148d3218df942e99b52be063183` | ## Pre-BigQuery vs Post-#15655: 8 New Files After PR #15655, the only difference from the pre-BigQuery baseline is the `iceberg-bigquery` module's own compiled classes: ``` org/apache/iceberg/gcp/bigquery/ org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.class org/apache/iceberg/gcp/bigquery/BigQueryMetastoreClient.class org/apache/iceberg/gcp/bigquery/BigQueryMetastoreClientImpl$1.class org/apache/iceberg/gcp/bigquery/BigQueryMetastoreClientImpl.class org/apache/iceberg/gcp/bigquery/BigQueryMetastoreUtils.class org/apache/iceberg/gcp/bigquery/BigQueryProperties.class org/apache/iceberg/gcp/bigquery/BigQueryTableOperations.class ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rdblue commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4170965505 @rambleraptor, can you please dump the dependencies that were shaded into the runtime Jars before this was added and after this commit so we can compare them? If we can't compare them, I'll open a PR to remove the new GCP Jar from the runtime bundles. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
jbonofre commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4157542991 This PR changes what is "shaded" in artifacts (like spark-runtime, etc). I have to check and update the `LICENSE`/ `NOTICE` in artifacts. I can do that in a follow up PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4138407443 @kevinjqliu I'm mostly doing sanity testing by making queries with/without the iceberg-gcp-bundle and expecting failures. If we can query BigQuery without including `iceberg-gcp-bundle`, it means we're taking on dependencies from that bundle unexpectedly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rambleraptor commented on code in PR #15655:
URL: https://github.com/apache/iceberg/pull/15655#discussion_r2997805657
##
build.gradle:
##
@@ -705,10 +706,10 @@ project(':iceberg-bigquery') {
implementation project(path: ':iceberg-bundled-guava', configuration:
'shadow')
-implementation platform(libs.google.libraries.bom)
+compileOnly platform(libs.google.libraries.bom)
compileOnly "com.google.cloud:google-cloud-storage"
-implementation "com.google.cloud:google-cloud-bigquery"
-implementation "com.google.cloud:google-cloud-core"
+compileOnly "com.google.cloud:google-cloud-bigquery"
+compileOnly "com.google.cloud:google-cloud-core"
Review Comment:
I did some sanity testing:
- Ran Spark with `iceberg-bigquery,iceberg-spark-runtime` to query a
BigQuery table. This failed since it required dependencies from
`iceberg-gcp-bundle` (for authentication / others).
- Ran Spark with
`iceberg-bigquery,iceberg-spark-runtime,iceberg-gcp-bundle`. Worked as expected.
This is telling me the bundles are taking on the dependencies we expect them
to.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
rdblue commented on code in PR #15655:
URL: https://github.com/apache/iceberg/pull/15655#discussion_r2991251208
##
build.gradle:
##
@@ -705,10 +706,10 @@ project(':iceberg-bigquery') {
implementation project(path: ':iceberg-bundled-guava', configuration:
'shadow')
-implementation platform(libs.google.libraries.bom)
+compileOnly platform(libs.google.libraries.bom)
compileOnly "com.google.cloud:google-cloud-storage"
-implementation "com.google.cloud:google-cloud-bigquery"
-implementation "com.google.cloud:google-cloud-core"
+compileOnly "com.google.cloud:google-cloud-bigquery"
+compileOnly "com.google.cloud:google-cloud-core"
Review Comment:
Can you confirm that this works with the current Google cloud bundle that is
produced?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Change bigquery to compileonly [iceberg]
kevinjqliu commented on PR #15655: URL: https://github.com/apache/iceberg/pull/15655#issuecomment-4071484676 is there a way to validate this? similar to ryans comment here from the original pr https://github.com/apache/iceberg/pull/14221#issuecomment-3407070906 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
