liucao-dd commented on code in PR #16154:
URL: https://github.com/apache/iceberg/pull/16154#discussion_r3398915313
##########
docs/docs/spark-configuration.md:
##########
@@ -199,6 +199,8 @@ val spark = SparkSession.builder()
| spark.sql.iceberg.data-planning-mode | AUTO | Scan planning mode for data files (`AUTO`, `LOCAL`, `DISTRIBUTED`) |
| spark.sql.iceberg.delete-planning-mode | AUTO | Scan planning mode for delete files (`AUTO`, `LOCAL`, `DISTRIBUTED`) |
| spark.sql.iceberg.advisory-partition-size | Table default | Advisory size (bytes) used for writing to the Table when Spark's Adaptive Query Execution is enabled. Used to size output files |
+| spark.sql.iceberg.split-size | Table default | Overrides `read.split.target-size` for scan planning. Session values are honored like read options and disable adaptive split-size adjustment |
+| spark.sql.iceberg.split-size.<table-name> | Global session default | Table-scoped split size override using the fully qualified table name as a suffix |
Review Comment:
Could we consider making table-scoped session configs a generic
identity-first pattern instead of a split-size-specific suffix?

The current shape, `spark.sql.iceberg.split-size.<table-name>`, puts the
setting before the table identity. That works for this one config, but it is
harder to generalize and can be confusing once more session-backed settings
become table-scoped.

An alternative is to resolve table-scoped session configs as:

`spark.sql.iceberg.<catalog>.<namespace...>.<table>.<setting-suffix>`

For this config, if the global key becomes
`spark.sql.iceberg.read.split-size`, the table-scoped key would be:

`spark.sql.iceberg.<catalog>.<namespace...>.<table>.read.split-size`

That keeps the table identity together, then applies the same setting suffix
used by the global session config. It should also avoid reverse-parsing
ambiguity if the key is constructed from the resolved catalog + Spark
`Identifier` and looked up exactly.
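
To make the suggestion concrete, here is a minimal sketch of the forward-construction lookup, not code from this PR. Everything in it is hypothetical: the class `TableScopedConf`, its methods, the plain `Map` standing in for the Spark session conf, and the `prod.db.events` table with its values. It also assumes the global key has been renamed to `spark.sql.iceberg.read.split-size`, which is only proposed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: illustrates the identity-first key shape only.
public class TableScopedConf {
  private static final String PREFIX = "spark.sql.iceberg.";

  // Builds spark.sql.iceberg.<catalog>.<namespace...>.<table>.<setting-suffix>
  // from already-resolved parts, so the key is never parsed back into parts.
  // Assumption: names are joined as-is; dots inside names are not escaped here.
  static String tableScopedKey(
      String catalog, String[] namespace, String table, String settingSuffix) {
    List<String> parts = new ArrayList<>();
    parts.add(catalog);
    parts.addAll(Arrays.asList(namespace));
    parts.add(table);
    parts.add(settingSuffix);
    return PREFIX + String.join(".", parts);
  }

  // Builds the global session key: spark.sql.iceberg.<setting-suffix>
  static String globalKey(String settingSuffix) {
    return PREFIX + settingSuffix;
  }

  // Exact lookup of the table-scoped key first, then the global session key.
  // An empty result means the caller falls back to the table default.
  static Optional<String> resolve(
      Map<String, String> sessionConf,
      String catalog,
      String[] namespace,
      String table,
      String settingSuffix) {
    String scoped = sessionConf.get(tableScopedKey(catalog, namespace, table, settingSuffix));
    if (scoped != null) {
      return Optional.of(scoped);
    }
    return Optional.ofNullable(sessionConf.get(globalKey(settingSuffix)));
  }

  public static void main(String[] args) {
    // A plain map stands in for the Spark session conf in this sketch.
    Map<String, String> sessionConf = new HashMap<>();
    sessionConf.put("spark.sql.iceberg.read.split-size", "134217728");
    sessionConf.put("spark.sql.iceberg.prod.db.events.read.split-size", "268435456");

    String[] ns = {"db"};
    // Table with its own override.
    System.out.println(resolve(sessionConf, "prod", ns, "events", "read.split-size"));
    // Table without an override falls back to the global session value.
    System.out.println(resolve(sessionConf, "prod", ns, "orders", "read.split-size"));
  }
}
```

Because the key is only ever built from the resolved catalog and identifier, the same helper could serve any other session-backed setting by changing the suffix. Whether names containing dots need escaping is left open in this sketch.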
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]