snazy opened a new pull request, #1532: URL: https://github.com/apache/polaris/pull/1532
Using `HadoopFileIO` in Polaris can enable "hidden features" that users are likely not aware of. This change requires users to manually update the configuration to be able to use `HadoopFileIO`, in a way that highlights the consequences of enabling it.

This PR updates Polaris in multiple ways:

* The default of `SUPPORTED_CATALOG_STORAGE_TYPES` is changed to not include the `FILE` storage type.
* Respect the `ALLOW_SPECIFYING_FILE_IO_IMPL` configuration on namespaces, tables and views to prevent setting an `io-impl` value for anything but one of the configured, supported storage types.
* Unify validation code in a new class `IcebergPropertiesValidation`.
* Using `FILE` or `HadoopFileIO` now _also_ requires the explicit configuration `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS=true`.
* Added production readiness checks that trigger when `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS` is `true` or `SUPPORTED_CATALOG_STORAGE_TYPES` contains `FILE` (defaults and per-realm).
* The two new readiness checks are considered _severe_. Severe readiness errors prevent the server from starting up, unless the user explicitly configured `polaris.readiness.ignore-security-issues=true`.

Log messages and configuration options explicitly use "clear" phrases highlighting the consequences.

With these changes it is intentionally extremely difficult to start Polaris with `HadoopFileIO`. People who work around all these safety nets must have realized what they are doing.

A lot of the test code relies on `FILE`/`HadoopFileIO`. Those tests were given all the configuration they need to continue working as they are, bypassing the added security safeguards.

<!-- Possible security vulnerabilities: STOP here and contact secur...@apache.org instead!
-->

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@polaris.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
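
The startup gating described in the PR summary above can be illustrated with a minimal Java sketch. It is not the code from this PR: the class `ReadinessSketch`, the nested `Finding` type, and the methods `check` and `mayStart` are hypothetical names invented for illustration, and per-realm overrides are left out. Only the configuration names quoted in the messages come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types are taken from the Polaris code base.
public class ReadinessSketch {

    // A single readiness finding with a severity flag (hypothetical type).
    public static final class Finding {
        public final String message;
        public final boolean severe;

        public Finding(String message, boolean severe) {
            this.message = message;
            this.severe = severe;
        }

        public boolean isSevere() {
            return severe;
        }
    }

    // Produces one severe finding per condition named in the PR description.
    public static List<Finding> check(Set<String> supportedStorageTypes,
                                      boolean allowInsecureStorageTypes) {
        List<Finding> findings = new ArrayList<>();
        if (allowInsecureStorageTypes) {
            findings.add(new Finding(
                "ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS is true", true));
        }
        if (supportedStorageTypes.contains("FILE")) {
            findings.add(new Finding(
                "SUPPORTED_CATALOG_STORAGE_TYPES contains FILE", true));
        }
        return findings;
    }

    // Severe findings block startup unless the operator explicitly opted out,
    // mirroring polaris.readiness.ignore-security-issues=true.
    public static boolean mayStart(List<Finding> findings, boolean ignoreSecurityIssues) {
        boolean anySevere = findings.stream().anyMatch(Finding::isSevere);
        return !anySevere || ignoreSecurityIssues;
    }

    public static void main(String[] args) {
        List<Finding> findings = check(Set.of("S3", "FILE"), true);
        for (Finding f : findings) {
            System.out.println("SEVERE readiness issue: " + f.message);
        }
        System.out.println("may start: " + mayStart(findings, false));
    }
}
```

The sketch keeps the two conditions independent, so either one alone is enough to block startup, which matches the "or" in the description of the readiness checks.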