rdblue commented on a change in pull request #1783:
URL: https://github.com/apache/iceberg/pull/1783#discussion_r539498982
##########
File path:
spark3/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java
##########
@@ -56,48 +83,85 @@ public boolean supportsExternalMetadata() {
}
@Override
- public SparkTable getTable(StructType schema, Transform[] partitioning, Map<String, String> options) {
- // Get Iceberg table from options
- Configuration conf = SparkSession.active().sessionState().newHadoopConf();
- Table icebergTable = getTableAndResolveHadoopConfiguration(options, conf);
-
- // Build Spark table based on Iceberg table, and return it
- // Eagerly refresh the table before reading to ensure views containing this table show up-to-date data
- return new SparkTable(icebergTable, schema, true);
+ public Table getTable(StructType schema, Transform[] partitioning, Map<String, String> options) {
+ String catalogName = extractCatalog(new CaseInsensitiveStringMap(options));
+ Identifier ident = extractIdentifier(new CaseInsensitiveStringMap(options));
+ CatalogManager catalogManager = SparkSession.active().sessionState().catalogManager();
+ CatalogPlugin catalog = catalogManager.catalog(catalogName);
+ try {
+ if (catalog instanceof TableCatalog) {
+ return ((TableCatalog) catalog).loadTable(ident);
+ }
+ } catch (NoSuchTableException e) {
+ // throwing an iceberg NoSuchTableException because the Spark one is typed and cant be thrown from this interface
+ throw new org.apache.iceberg.exceptions.NoSuchTableException(e, "Cannot find table for %s.", ident);
+ }
+ // throwing an iceberg NoSuchTableException because the Spark one is typed and cant be thrown from this interface
+ throw new org.apache.iceberg.exceptions.NoSuchTableException("Cannot find table for %s.", ident);
}
- protected Table findTable(Map<String, String> options, Configuration conf) {
+ private Spark3Util.CatalogAndIdentifier catalogAndIdentifier(CaseInsensitiveStringMap options) {
Preconditions.checkArgument(options.containsKey("path"), "Cannot open table: path is not set");
String path = options.get("path");
-
- if (path.contains("/")) {
- HadoopTables tables = new HadoopTables(conf);
- return tables.load(path);
+ SparkSession spark = SparkSession.active();
+ Spark3Util.CatalogAndIdentifier catalogAndIdentifier;
+ try {
+ catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, path);
+ } catch (ParseException e) {
+ List<String> ident = new ArrayList<>();
+ ident.add(path);
+ catalogAndIdentifier = Spark3Util.catalogAndIdentifier(spark, ident);
+ }
+ CatalogManager catalogManager = spark.sessionState().catalogManager();
+ String[] currentNamespace = catalogManager.currentNamespace();
+ // we have to check for paths but want to re-use the exiting utils to extract catalog/identifier
+ if (checkPathIdentifier(catalogAndIdentifier.identifier(), currentNamespace)) {
Review comment:
I think it would be simpler just to start this method with a check for
the path:
```
if (path.contains("/")) {
Identifier ident = new PathIdentifier(path);
// do catalog resolution
return catalog, ident;
}
```
It's easier to document and understand if the rules are simple.
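A minimal, self-contained sketch of that path-first ordering follows. It is an illustration under stated assumptions, not the code in this PR: `PathFirstResolution`, `Resolved`, `resolve`, the `knownCatalogs` parameter, and the `spark_catalog` default are hypothetical stand-ins for the Spark and Iceberg types (`CatalogPlugin`, `Identifier`, `PathIdentifier`, `Spark3Util.CatalogAndIdentifier`), so that the rule can be read and run without a Spark session.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only. Every name in this class is hypothetical and stands in
// for the real Spark/Iceberg types discussed in the review.
final class PathFirstResolution {
  // Assumed name of the fallback catalog; the real default depends on the session.
  static final String DEFAULT_CATALOG = "spark_catalog";

  // Hypothetical stand-in for a resolved (catalog, identifier) pair.
  static final class Resolved {
    final String catalog;
    final String[] namespace;
    final String name;
    final boolean isPath;

    Resolved(String catalog, String[] namespace, String name, boolean isPath) {
      this.catalog = catalog;
      this.namespace = namespace;
      this.name = name;
      this.isPath = isPath;
    }
  }

  static Resolved resolve(String path, Set<String> knownCatalogs) {
    if (path == null || path.isEmpty()) {
      throw new IllegalArgumentException("Cannot open table: path is not set");
    }

    // Rule 1: anything containing a slash is a location, checked before any parsing.
    if (path.contains("/")) {
      return new Resolved(DEFAULT_CATALOG, new String[0], path, true);
    }

    // Rule 2: otherwise treat the value as a dot-separated identifier.
    String[] parts = path.split("\\.");
    if (parts.length == 0) {
      throw new IllegalArgumentException("Cannot parse identifier: " + path);
    }

    // A leading part names a catalog only if it is a known catalog and more parts follow.
    String catalog = DEFAULT_CATALOG;
    int start = 0;
    if (parts.length > 1 && knownCatalogs.contains(parts[0])) {
      catalog = parts[0];
      start = 1;
    }

    String[] namespace = Arrays.copyOfRange(parts, start, parts.length - 1);
    return new Resolved(catalog, namespace, parts[parts.length - 1], false);
  }

  public static void main(String[] args) {
    Set<String> catalogs = new HashSet<>(Arrays.asList("prod"));
    for (String p : new String[] {"hdfs://nn/warehouse/db/t", "prod.db.t", "db.t"}) {
      Resolved r = resolve(p, catalogs);
      System.out.println(p + " -> catalog=" + r.catalog
          + ", namespace=" + Arrays.toString(r.namespace)
          + ", name=" + r.name + ", isPath=" + r.isPath);
    }
  }
}
```

Because the slash check runs first, a location never reaches the identifier parser, so no `ParseException` fallback or current-namespace comparison is needed to tell the two cases apart.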