szehon-ho commented on a change in pull request #4255:
URL: https://github.com/apache/iceberg/pull/4255#discussion_r827424145
##########
File path: docs/spark/spark-queries.md
##########
@@ -168,7 +168,9 @@ To inspect a table's history, snapshots, and other metadata, Iceberg supports me
 Metadata tables are identified by adding the metadata table name after the original table name. For example, history for `db.table` is read using `db.table.history`.
 {{< hint info >}}
-As of Spark 3.0, the format of the table name for inspection (`catalog.database.table.metadata`) doesn't work with Spark's default catalog (`spark_catalog`). If you've replaced the default catalog, you may want to use `DataFrameReader` API to inspect the table.
+For Spark 2.4, use the `DataFrameReader` API to [inspect tables](#inspecting-with-dataframes).
+
+For Spark 3, prior to 3.2, the Spark session catalog (`spark_catalog`) does not support table names with multipart identifiers such as `catalog.database.table.metadata`. To work around this, for querying metadata tables, configure a different catalog that uses the Iceberg `SparkCatalog` class, or use the Spark `DataFrameReader` API. From Spark 3.2 onwards, the session catalog supports table names with multipart identifiers.
Review comment:
Do we need the `spark_catalog` mention here? It seems a bit confusing, as we are talking about the Spark session catalog. It was probably carried over from before, when the text mentioned the default catalog. If it's not needed, I feel we can remove it.
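
For reference, a minimal sketch of the `DataFrameReader` workaround the new text describes; it assumes a running `SparkSession` (`spark`) with the Iceberg Spark runtime on the classpath, and reuses the doc's example table `db.table`:

```scala
// Inspect Iceberg metadata tables via the DataFrameReader API.
// This path works on Spark 2.4 and on Spark 3 versions before 3.2,
// where the session catalog rejects multipart identifiers like
// catalog.database.table.metadata.
val history = spark.read.format("iceberg").load("db.table.history")
history.show(truncate = false)

// Other metadata tables follow the same naming scheme, e.g. snapshots:
val snapshots = spark.read.format("iceberg").load("db.table.snapshots")
snapshots.show(truncate = false)
```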
##########
File path: docs/spark/spark-queries.md
##########
Review comment:
Also, I feel it is still a bit wordy; can we just say:

> As a workaround, configure the catalog `org.apache.iceberg.spark.SparkCatalog` or use the `DataFrameReader` API.

It's more consistent with the previous sentence. I was thinking we don't need 'for querying metadata tables', as that's what this section is about. And I feel there's no need to mention 'From Spark 3.2 onwards', as it's implied the issue is gone after 3.2.
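
For reference, a sketch of the catalog workaround the suggested wording points to; the catalog name `iceberg_catalog` and the `hive` catalog type are illustrative assumptions, and the Iceberg Spark runtime jar is assumed to be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Register a separate catalog backed by Iceberg's SparkCatalog so that
// metadata tables can be queried with multipart identifiers on Spark 3
// versions before 3.2. The catalog name "iceberg_catalog" is hypothetical.
val spark = SparkSession.builder()
  .appName("inspect-iceberg-metadata")
  .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.iceberg_catalog.type", "hive") // assumes a Hive metastore
  .getOrCreate()

// The metadata table name goes after the full table identifier.
spark.sql("SELECT * FROM iceberg_catalog.db.table.history").show(truncate = false)
```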
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]