openinx opened a new pull request #1363: URL: https://github.com/apache/iceberg/pull/1363
Before this patch, `TestFlinkAvroReaderWriter` was designed as follows:

1. Generate `Records` with the RandomGenerateData utility, then write them into an avro file;
2. Read those rows from the avro file into a `RowData` list;
3. Write them into a new avro file;
4. Read the new avro file into `RowData` and compare the resulting `RowData` list with the list from step 2.

The problem is: if the `RowData` reader has a bug, then the `RowData` list from step 2 will be incorrect, and because step 4 compares against that same list, the comparison can still pass. This ends up masking bugs in the `RowData` writer as well. A better way is to compare `RowData` with the generated `Record` list.

This patch enhances the avro unit tests so that they confirm the `RowData` list that is read and written is consistent with the `Records` list (which is written and read by the generic `Record` writer and reader).

BTW, this patch also cleans up several things in the unit tests under the flink module:

1. Since the parquet reader has been rewritten on top of `RowData`, nothing depends on `Row` any more, so we can remove `RandomData.java`.
2. All three file format reader/writer unit tests extend `org.apache.iceberg.data.DataTest`, and `DataTest` already defines `public TemporaryFolder temp = new TemporaryFolder()`, so we no longer need to define `temp` in the subclasses.
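
To illustrate why comparing against the step 2 list can hide a bug, here is a minimal, self-contained sketch. It does not use any Iceberg or Flink classes: the "files" are plain lists, the values stand for microsecond timestamps, and every name (`RoundTripMaskingSketch`, `genericWrite`, `buggyRowRead`, `rowWrite`) is hypothetical. The reader bug (truncating microseconds to milliseconds) is an assumed example, not a bug reported in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RoundTripMaskingSketch {

  // Hypothetical stand-in for the generic Record writer: stores values unchanged.
  public static List<Long> genericWrite(List<Long> records) {
    return new ArrayList<>(records);
  }

  // Hypothetical stand-in for a buggy RowData reader:
  // it silently truncates microseconds to millisecond precision.
  public static List<Long> buggyRowRead(List<Long> file) {
    List<Long> rows = new ArrayList<>();
    for (long micros : file) {
      rows.add(micros / 1000 * 1000);
    }
    return rows;
  }

  // Hypothetical stand-in for the RowData writer: stores whatever it is given.
  public static List<Long> rowWrite(List<Long> rows) {
    return new ArrayList<>(rows);
  }

  // Old test shape: compare the second read against the first read (step 4 vs step 2).
  public static boolean roundTripCheckPasses(List<Long> records) {
    List<Long> firstRead = buggyRowRead(genericWrite(records));
    List<Long> secondRead = buggyRowRead(rowWrite(firstRead));
    return firstRead.equals(secondRead);
  }

  // New test shape: compare what was read against the generated records.
  public static boolean sourceOfTruthCheckPasses(List<Long> records) {
    List<Long> read = buggyRowRead(genericWrite(records));
    return read.equals(records);
  }

  public static void main(String[] args) {
    List<Long> records = Arrays.asList(1_234_567L, 42L, 9_000_001L);
    System.out.println("round-trip check passes: " + roundTripCheckPasses(records));
    System.out.println("source-of-truth check passes: " + sourceOfTruthCheckPasses(records));
  }
}
```

In this sketch the round-trip check passes because the truncation is applied identically on both reads, while the comparison against the generated records fails and exposes the lossy reader.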
