zhangdove opened a new issue #1160:
URL: https://github.com/apache/iceberg/issues/1160


   I have some test code that calls `removeOrphanFiles`, and it throws 
`NoSuchTableException`.
   
   ```scala
     import java.util.{ArrayList, List}

     import org.apache.hadoop.conf.Configuration
     import org.apache.iceberg.{Schema, Table}
     import org.apache.iceberg.actions.Actions
     import org.apache.iceberg.catalog.{Namespace, TableIdentifier}
     import org.apache.iceberg.hadoop.HadoopCatalog
     import org.apache.iceberg.types.Types
     import org.apache.spark.sql.SparkSession

     case class TwoColumnRecord(id: String, name: String)
   
     def testCode(spark: SparkSession): Unit = {
       val schemaName = "testDb"
       val tableName = "testTb"
   
       val conf: Configuration = new Configuration(spark.sparkContext.hadoopConfiguration)
       val catalog: HadoopCatalog = new HadoopCatalog(conf, conf.get("fs.defaultFS") + "/iceberg/warehouse")
   
       // 1. create iceberg table by hadoopCatalog
       val nameSpace = Namespace.of(schemaName)
       val tableIdentifier: TableIdentifier = TableIdentifier.of(nameSpace, tableName)
       val columns: List[Types.NestedField] = new ArrayList[Types.NestedField]
       columns.add(Types.NestedField.of(1, true, "id", Types.StringType.get, "id doc"))
       columns.add(Types.NestedField.of(2, true, "name", Types.StringType.get, "name doc"))
       val schema: Schema = new Schema(columns)
       val table: Table = catalog.createTable(tableIdentifier, schema)
   
       // 2. create DataFrame
       val df = spark.createDataFrame(Seq(TwoColumnRecord("1", "iceberg"), TwoColumnRecord("2", "spark"))).toDF()
       // 3. write data to iceberg table
       df.write.format("iceberg").mode("append").save(table.location())
       Thread.sleep(1000)
       // 4. write plain parquet files (not tracked by Iceberg) into the table's data path
       df.write.format("parquet").mode("append").save(table.location() + "/data/")
   
       // 5. removeOrphanFiles
       Thread.sleep(1000)
       val actions: Actions = Actions.forTable(table)
       val removeFileList = actions.removeOrphanFiles().olderThan(System.currentTimeMillis()).execute()
       // throws NoSuchTableException here and exits
     }
   ```
   The expected result is a normal exit with the orphan files deleted. Instead, 
I get this error:
   ```java
   Exception in thread "main" org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist: testDb.testTb
        at org.apache.iceberg.BaseMetastoreCatalog.loadMetadataTable(BaseMetastoreCatalog.java:153)
        at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:139)
        at org.apache.iceberg.spark.source.IcebergSource.findTable(IcebergSource.java:148)
        at org.apache.iceberg.spark.source.IcebergSource.getTableAndResolveHadoopConfiguration(IcebergSource.java:177)
        at org.apache.iceberg.spark.source.IcebergSource.createReader(IcebergSource.java:80)
        at org.apache.iceberg.spark.source.IcebergSource.createReader(IcebergSource.java:74)
        at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$SourceHelpers.createReader(DataSourceV2Relation.scala:155)
        at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:172)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:204)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
        at org.apache.iceberg.actions.RemoveOrphanFilesAction.buildValidDataFileDF(RemoveOrphanFilesAction.java:161)
        at org.apache.iceberg.actions.RemoveOrphanFilesAction.execute(RemoveOrphanFilesAction.java:139)
        at com.dove.iceberg.IcebergIssues$.testCode(IcebergIssues.scala:64)
        at com.dove.iceberg.IcebergIssues$.main(IcebergIssues.scala:29)
   ```
   I checked the table's files on HDFS:
   ```bash
   [root@hadoop39 ~]# hdfs dfs -ls /iceberg/warehouse/testDb/testTb/*/*
   -rw-r--r--   3 hdfs supergroup        645 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/00000-0-dba22cf3-b96c-467c-8bf3-01dd7d2f45c6-00000.parquet
   -rw-r--r--   3 hdfs supergroup        630 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/00001-1-1c2caf36-bc08-492d-8f1b-cc0fe83795ab-00000.parquet
   -rw-r--r--   3 hdfs supergroup          0 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/_SUCCESS
   -rw-r--r--   3 hdfs supergroup        607 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/part-00000-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet
   -rw-r--r--   3 hdfs supergroup        589 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/data/part-00001-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet
   -rw-r--r--   3 hdfs supergroup       4221 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/1c330540-e731-4804-b1d4-4ace0952ea0a-m0.avro
   -rw-r--r--   3 hdfs supergroup       2544 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/snap-7685756080210806989-1-1c330540-e731-4804-b1d4-4ace0952ea0a.avro
   -rw-r--r--   3 hdfs supergroup        762 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/v1.metadata.json
   -rw-r--r--   3 hdfs supergroup       1503 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/v2.metadata.json
   -rw-r--r--   3 hdfs supergroup          1 2020-07-02 15:39 /iceberg/warehouse/testDb/testTb/metadata/version-hint.text
   ```
   So the Iceberg table was created successfully, and the data was written 
successfully.
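
To make the expected outcome of step 5 concrete, here is a minimal plain-Scala sketch of the selection I expect: files under the data path that the table metadata does not reference and that are older than the cutoff. This is only an illustration, not Iceberg's implementation. `OrphanSketch`, `ListedFile`, and `orphanCandidates` are hypothetical names, the modification times are made up, and the assumption that the two `part-*.snappy.parquet` files from step 4 are the untracked ones is inferred from the test code, not read from the metadata.

```scala
object OrphanSketch {
  // Hypothetical model of one listed file: its path and modification time (epoch millis).
  final case class ListedFile(path: String, modifiedMillis: Long)

  // Keep files that are NOT referenced by table metadata AND are older than the cutoff.
  def orphanCandidates(
      listed: Seq[ListedFile],
      referenced: Set[String],
      olderThanMillis: Long): Seq[String] =
    listed
      .filter(f => !referenced.contains(f.path) && f.modifiedMillis < olderThanMillis)
      .map(_.path)

  def main(args: Array[String]): Unit = {
    // File names taken from the listing above; timestamps are invented for the sketch.
    val listed = Seq(
      ListedFile("data/00000-0-dba22cf3-b96c-467c-8bf3-01dd7d2f45c6-00000.parquet", 1000L),
      ListedFile("data/00001-1-1c2caf36-bc08-492d-8f1b-cc0fe83795ab-00000.parquet", 1000L),
      ListedFile("data/part-00000-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet", 1000L),
      ListedFile("data/part-00001-7c56f3e9-3b0f-48db-ad77-340ea302074c-c000.snappy.parquet", 1000L)
    )
    // Assumption: only the files written through format("iceberg") in step 3 are tracked.
    val referenced = Set(
      "data/00000-0-dba22cf3-b96c-467c-8bf3-01dd7d2f45c6-00000.parquet",
      "data/00001-1-1c2caf36-bc08-492d-8f1b-cc0fe83795ab-00000.parquet"
    )
    orphanCandidates(listed, referenced, olderThanMillis = 2000L).foreach(println)
  }
}
```

With these inputs the sketch selects only the two `part-*.snappy.parquet` paths, which is what I expected `removeOrphanFiles` to delete instead of failing.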
   
   

