At first glance, this looks like the known issue with Spark resolving a custom catalog in 0.11, based on the code here: https://github.com/apache/iceberg/blob/29cf712a821aa937e176f2d79a5593c4a1429e7f/spark/src/main/java/org/apache/iceberg/actions/BaseSparkAction.java#L138-L170
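
For reference, the Glue catalog is usually registered on the Spark session along these lines. This is only a minimal sketch based on the iceberg-aws documentation, not your actual job configuration; the application name and warehouse path are placeholders. The point is that the action loads metadata tables (such as ALL_MANIFESTS) through the Spark-level catalog, so the catalog has to be configured on the session that runs the job:

    import org.apache.spark.sql.SparkSession;

    public class GlueCatalogSessionSketch {
      public static void main(String[] args) {
        // Register an Iceberg SparkCatalog named "glue_catalog" backed by AWS Glue.
        // The orphan-file action resolves metadata tables through this Spark catalog.
        SparkSession spark = SparkSession.builder()
            .appName("iceberg-table-maintenance") // placeholder name
            .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
            .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
            .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
            .config("spark.sql.catalog.glue_catalog.warehouse", "s3://my-bucket/warehouse") // placeholder path
            .getOrCreate();

        spark.stop();
      }
    }
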
Could you provide more details of the stack trace beyond BaseSparkAction.loadMetadataTable(BaseSparkAction.java:191)? Also, that codebase has changed a lot since 0.11, so I would recommend trying the latest EMR Spark 3.1 version with the newly released Iceberg 0.12.0 to see if the problem persists. A quick way to sanity-check the catalog resolution is sketched after the quoted message below.

Best,
Jack Ye

On Thu, Aug 19, 2021 at 3:11 PM raghavendra186 <raghu.st...@gmail.com> wrote:

> Hi Guys,
>
> I am working with Iceberg 0.11.1 and Spark 3.0.1. When I run removeOrphanFiles, using either the Actions or the SparkActions class, it works with the Hadoop catalog locally, but I get the exception below when running on EMR with the Glue catalog. Could you please help me with what I am missing here?
>
> Code snippet:
>
> Actions.forTable(table).removeOrphanFiles().olderThan(removeOrphanFilesOlderThan).execute();
>
> or
>
> SparkActions.get().deleteOrphanFiles(table).olderThan(removeOrphanFilesOlderThan).execute();
>
> Issue (when run on EMR):
>
> 21/08/19 08:12:56 INFO RemoveOrphanFilesMaintenanceJob: Running RemoveOrphanFilesMaintenanceJob - removeOrphanFilesOlderThanTimestamp, Status:Started, tenant: 1, table:raghu3.cars, removeOrphanFilesOlderThan: {1629360476572}.
>
> 21/08/19 08:12:56 ERROR RemoveOrphanFilesMaintenanceJob: Error in RemoveOrphanFilesMaintenanceJob - removeOrphanFilesOlderThanTimestamp, Illegal Arguments in table properties - Can't parse null value from table properties, tenant: tenantId1, table: raghu3.cars, removeOrphanFilesOlderThan: 1629360476572, Status: Failed, Reason: {}.
>
> java.lang.IllegalArgumentException: Cannot find the metadata table for glue_catalog.raghu3.cars of type ALL_MANIFESTS
>     at org.apache.iceberg.spark.actions.BaseSparkAction.loadMetadataTable(BaseSparkAction.java:191)
>     at org.apache.iceberg.spark.actions.BaseSparkAction.buildValidDataFileDF(BaseSparkAction.java:121)
>     at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.doExecute(BaseDeleteOrphanFilesSparkAction.java:154)
>     at org.apache.iceberg.spark.actions.BaseSparkAction.withJobGroupInfo(BaseSparkAction.java:101)
>     at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.execute(BaseDeleteOrphanFilesSparkAction.java:141)
>     at org.apache.iceberg.spark.actions.BaseDeleteOrphanFilesSparkAction.execute(BaseDeleteOrphanFilesSparkAction.java:76)
>     at com.salesforce.cdp.lakehouse.spark.tablemaintenance.job.RemoveOrphanFilesMaintenanceJob.removeOrphanFilesOlderThanTimestamp(RemoveOrphanFilesMaintenanceJob.java:274)
>     at com.salesforce.cdp.lakehouse.spark.tablemaintenance.job.RemoveOrphanFilesMaintenanceJob.removeOrphanFiles(RemoveOrphanFilesMaintenanceJob.java:133)
>     at com.salesforce.cdp.lakehouse.spark.tablemaintenance.job.RemoveOrphanFilesMaintenanceJob.maintain(RemoveOrphanFilesMaintenanceJob.java:58)
>     at com.salesforce.cdp.lakehouse.spark.tablemaintenance.LakeHouseTableMaintenanceJob.run(LakeHouseTableMaintenanceJob.java:117)
>     at com.salesforce.cdp.spark.core.job.SparkJob.submitAndRun(SparkJob.java:76)
>     at com.salesforce.cdp.lakehouse.spark.tablemaintenance.LakeHouseTableMaintenanceJob.main(LakeHouseTableMaintenanceJob.java:247)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:735)
>
> The table does exist:
>
> [image: image.png]
>
> Did anyone face this? What is the fix? Is it a bug, or am I missing something here?
>
> Thanks,
> Raghu
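
As mentioned above, a quick sanity check is to query the metadata table directly from the same Spark session that runs the action. This is a minimal sketch, reusing the table name from your report and assuming the catalog is registered in the session as glue_catalog:

    import org.apache.spark.sql.SparkSession;

    public class MetadataTableCheck {
      public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("metadata-table-check").getOrCreate();

        // If the catalog is wired up correctly, this lists the table's manifests.
        // If Spark cannot resolve glue_catalog.raghu3.cars.all_manifests here either,
        // that points at the same catalog-resolution problem the action hits when
        // loading ALL_MANIFESTS.
        spark.sql("SELECT * FROM glue_catalog.raghu3.cars.all_manifests").show();

        spark.stop();
      }
    }

If this query fails while the plain table query succeeds, that would support the catalog-resolution theory above rather than an issue with the table itself.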