Hi,

We are in the process of converting Hive datasets to Iceberg datasets.
In the process, we noticed that each data-file entry in a manifest file has a required record_count field. Populating it accurately would require reading the footer (Parquet) or file tail (ORC) of every data file. For Avro files, it would require reading the header of every block and summing the per-block record counts.

Is the record_count in the data-file entry expected to be accurate, or can we estimate it from the file size and an estimated row size?

Thanks,
Vivek
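
For reference, here is a rough sketch of what an exact count for an Avro file would involve. It assumes the standard Avro object container layout (4-byte magic, metadata map, 16-byte sync marker, then blocks that each start with a record count and a byte size). The names `read_long`, `skip_metadata`, and `count_avro_records` are hypothetical helpers for illustration, not part of any Iceberg or Avro API.

```python
MAGIC = b"Obj\x01"   # Avro object container file magic
SYNC_SIZE = 16       # length of the sync marker in bytes


def read_long(f):
    """Decode one Avro long (zigzag varint); return None on clean EOF."""
    shift = 0
    acc = 0
    while True:
        b = f.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        acc |= (b[0] & 0x7F) << shift
        if not (b[0] & 0x80):
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)   # undo zigzag encoding


def skip_metadata(f):
    """Skip the header's metadata map (string keys, bytes values)."""
    while True:
        count = read_long(f)
        if count is None:
            raise EOFError("truncated header")
        if count == 0:
            return
        if count < 0:
            read_long(f)             # negative count is followed by a byte size
            count = -count
        for _ in range(count):
            f.seek(read_long(f), 1)  # skip key
            f.seek(read_long(f), 1)  # skip value


def count_avro_records(f):
    """Sum the per-block record counts without decoding any records."""
    if f.read(len(MAGIC)) != MAGIC:
        raise ValueError("not an Avro object container file")
    skip_metadata(f)
    sync = f.read(SYNC_SIZE)
    total = 0
    while True:
        n = read_long(f)             # records in this block
        if n is None:
            return total
        size = read_long(f)          # size of the serialized block in bytes
        f.seek(size, 1)              # skip the (possibly compressed) data
        if f.read(SYNC_SIZE) != sync:
            raise ValueError("sync marker mismatch")
        total += n
```

Called as `count_avro_records(open(path, "rb"))`, this only seeks from one block header to the next, so the cost grows with the number of blocks rather than the number of records. It still touches every block of every file, though, which is the overhead we would like to avoid if an estimate is acceptable.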