gh-yzou commented on PR #1862: URL: https://github.com/apache/polaris/pull/1862#issuecomment-3050223421
@rahil-c Sorry, I wrote my comment yesterday but forgot to push it. I have now pushed it and added some more comments; please let me know if you have more questions about this!

As we discussed, there are two main concerns with this PR:

1) The Hudi dependency introduced for the Spark client, which is caused by the use of HoodieInternalV2Table. This can be resolved by loading a V1Table and then letting HudiCatalog's loadTable handle the final table result: https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/catalog/HoodieCatalog.scala#L123

2) The extra namespace creation for HudiCatalog. The Polaris Spark Client reuses the whole Iceberg namespace, so ideally we do not want to maintain extra namespace creation just for a specific table format. The extra namespace creation is needed because HudiCatalog only works with the SparkSession Catalog and HiveCatalog today (https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala#L198); however, since Polaris is a REST catalog, this will not work anymore. We want to see if we can push the Hudi community to improve its catalog implementation with regard to third-party catalog plugins, similar to how Delta added a special case for Unity Catalog here: https://github.com/delta-io/delta/blob/2d89954008b6c53e49744f09435136c5c63b9f2c/spark/src/main/scala/org/apache/spark/sql/delta/catalog/DeltaCatalog.scala#L218

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@polaris.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org