rahil-c commented on code in PR #1862:
URL: https://github.com/apache/polaris/pull/1862#discussion_r2180940060


##########
plugins/spark/v3.5/spark/src/main/java/org/apache/polaris/spark/utils/PolarisCatalogUtils.java:
##########
@@ -64,9 +92,13 @@ public static boolean isTableWithSparkManagedLocation(Map<String, String> proper
    * Load spark table using DataSourceV2.
    *
    * @return V2Table if DataSourceV2 is available for the table format. For delta table, it returns
-   *     DeltaTableV2.
+   *     DeltaTableV2. For hudi it should return HoodieInternalV2Table.
   */
-  public static Table loadSparkTable(GenericTable genericTable) {
+  public static Table loadSparkTable(GenericTable genericTable, Identifier identifier) {
+    if (genericTable.getFormat().equalsIgnoreCase("hudi")) {
+      // hudi does not implement table provider interface, so will need to catch it

Review Comment:
   Currently, `PolarisCatalogUtils.loadSparkTable` uses the `DataSourceV2Utils` utility to load the table through Spark's `TableProvider`, as seen [here](https://github.com/apache/polaris/blob/main/plugins/spark/v3.5/spark/src/main/java/org/apache/polaris/spark/utils/PolarisCatalogUtils.java#L86).
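   For context, here is a rough, simplified sketch of that provider-based path. It approximates the linked code rather than copying it verbatim (the real method also adjusts the table properties, e.g. injecting the table path, which is omitted here):
   
   ```java
   import java.util.Map;
   import org.apache.spark.sql.connector.catalog.Table;
   import org.apache.spark.sql.connector.catalog.TableProvider;
   import org.apache.spark.sql.execution.datasources.DataSource;
   import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils;
   import org.apache.spark.sql.internal.SQLConf;
   import org.apache.spark.sql.util.CaseInsensitiveStringMap;
   
   class ProviderBasedLoadSketch {
     // Look up the format's DataSourceV2 TableProvider and let Spark build the
     // V2 Table from it. For Delta, DeltaDataSource implements TableProvider,
     // so this yields a DeltaTableV2.
     static Table loadViaTableProvider(String format, Map<String, String> tableProperties) {
       // The lookup fails if the format does not register a V2 TableProvider,
       // which is exactly the Hudi problem described below.
       TableProvider provider = DataSource.lookupDataSourceV2(format, SQLConf.get()).get();
       return DataSourceV2Utils.getTableFromProvider(
           provider, new CaseInsensitiveStringMap(tableProperties), scala.Option.empty());
     }
   }
   ```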
   
   In the case of Delta, this goes through Delta's data source implementation, which implements the DataSourceV2 `TableProvider` interface; see `DeltaDataSource`: https://github.com/delta-io/delta/blob/master/spark/src/main/scala/org/apache/spark/sql/delta/sources/DeltaDataSource.scala#L57
   
   `PolarisCatalogUtils.loadSparkTable` currently assumes that other formats implement this same `TableProvider` interface. However, if a format does not, the load fails with an exception.
   
   Hudi's Spark integration does not implement that interface; see the entry point class for the Hudi data source, `DefaultSource`: https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala#L55. This means we need another way to load the Hudi table, which is why I added this condition and a method called `loadHudiSparkTable`.
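   To make that concrete, here is a minimal sketch of the dispatch this change introduces. The method names mirror the diff above; the body of `loadHudiSparkTable` is only a placeholder here, since constructing the `HoodieInternalV2Table` is exactly what this PR adds, and it reuses the `loadViaTableProvider` sketch from earlier:
   
   ```java
   import java.util.Map;
   import org.apache.spark.sql.connector.catalog.Identifier;
   import org.apache.spark.sql.connector.catalog.Table;
   
   class HudiDispatchSketch {
     // Illustrative dispatch only: "hudi" cannot go through the generic
     // TableProvider path, so it is routed to a dedicated loader.
     static Table loadSparkTable(String format, Map<String, String> properties, Identifier identifier) {
       if ("hudi".equalsIgnoreCase(format)) {
         // Hudi's DefaultSource is not a DataSourceV2 TableProvider, so the
         // table has to be constructed another way.
         return loadHudiSparkTable(properties, identifier);
       }
       // Every other format (e.g. Delta) keeps using the existing
       // DataSourceV2Utils/TableProvider path sketched above.
       return ProviderBasedLoadSketch.loadViaTableProvider(format, properties);
     }
   
     // Placeholder only: the real implementation added in this PR is expected
     // to return a HoodieInternalV2Table instead of going through TableProvider.
     static Table loadHudiSparkTable(Map<String, String> properties, Identifier identifier) {
       throw new UnsupportedOperationException("sketch only; see loadHudiSparkTable in the PR");
     }
   }
   ```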


