gh-yzou commented on PR #1862:
URL: https://github.com/apache/polaris/pull/1862#issuecomment-3050223421

   @rahil-c sorry, I wrote my comment yesterday but forgot to push it. I have 
now pushed it and added some more comments. Please let me know if you have more 
questions about this! 
   As we have discussed, there are two main concerns for this PR:
   1) The Hudi dependency introduced for the Spark client, which is caused by 
the use of HoodieInternalV2Table. This can be resolved by loading a V1Table and 
then letting HudiCatalog's loadTable handle the final table result: 
https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/catalog/HoodieCatalog.scala#L123
   2) The extra namespace creation for HudiCatalog. The Polaris Spark Client 
reuses the whole Iceberg namespace, so ideally we do not want to maintain extra 
namespace creation just for a specific table format. The extra namespace 
creation is needed because HudiCatalog only works with the SparkSession catalog 
and HiveCatalog today: 
https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala#L198
 However, since Polaris is a REST catalog, this will not work anymore. We want 
to see if we can push the Hudi community to improve the catalog implementation 
with regard to third-party catalog plugins, similar to how Delta added a 
special case for Unity Catalog here: 
https://github.com/delta-io/delta/blob/2d89954008b6c53e49744f09435136c5c63b9f2c/spark/src/main/scala/org/apache/spark/sql/delta/catalog/DeltaCatalog.scala#L218
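
   To make the idea in 1) concrete, here is a rough, self-contained sketch of 
the delegation pattern. It is not the actual Polaris, Spark, or Hudi code: every 
type and method name below (`Table`, `GenericV1Table`, `FormatCatalog`, `load`) 
is a hypothetical stand-in, used only to show that the client can build a 
generic table and let the format-specific catalog produce the final result, so 
the client never references a format-specific table class.

```java
import java.util.Map;

public class DelegatingLoadSketch {
    // Hypothetical stand-in for a table handle (not a Spark or Hudi type).
    public interface Table {
        String describe();
    }

    // Hypothetical format-agnostic table built only from catalog metadata.
    public static final class GenericV1Table implements Table {
        private final String name;
        private final Map<String, String> properties;

        public GenericV1Table(String name, Map<String, String> properties) {
            this.name = name;
            this.properties = properties;
        }

        public Map<String, String> properties() {
            return properties;
        }

        @Override
        public String describe() {
            return "v1:" + name;
        }
    }

    // Hypothetical format-specific catalog that refines a generic table
    // into whatever table implementation the format needs.
    public interface FormatCatalog {
        Table loadTable(Table generic);
    }

    // Client side: builds the generic table and delegates the final result.
    // Note there is no reference to any format-specific table class here.
    public static Table load(String name, Map<String, String> props, FormatCatalog delegate) {
        Table generic = new GenericV1Table(name, props);
        return delegate.loadTable(generic);
    }
}
```

   With this shape, the format-specific class would only be needed inside the 
delegate catalog, which is assumed to be supplied at runtime rather than 
compiled into the client.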
 

