[GitHub] [iceberg] openinx opened a new pull request #1363: Flink: Enhance the flink reader-writer unit tests by asserting between record and rowdata

GitBox Thu, 20 Aug 2020 23:48:54 -0700


openinx opened a new pull request #1363:
URL: https://github.com/apache/iceberg/pull/1363



   Before this patch, `TestFlinkAvroReaderWriter` works as follows:
   1.   Generate `Record`s with the `RandomGenericData` utility, then write them into an Avro file;
   2.   Read those rows from the Avro file into a `RowData` list;
   3.   Write that list into a new Avro file;
   4.   Read the new Avro file into a `RowData` list and compare it with the list from step 2.
   
   The problem: if the `RowData` reader has a bug, the `RowData` list from step 2 will be incorrect, which can in turn mask a bug in the `RowData` writer.
   
   A better approach is to compare the `RowData` directly with the generated `Record` list. This patch enhances the Avro unit tests to confirm that the `RowData` list that is read and written is consistent with the `Record` list (whose I/O goes through the generic `Record` reader and writer).
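
To illustrate the masking problem described above, here is a minimal, self-contained sketch. It does not use any Iceberg or Flink API: the "file" is just an in-memory list, the records are plain strings, and every name in it (`RoundTripMaskingSketch`, `genericWrite`, `buggyRowRead`, `rowWrite`) is hypothetical. The reader is assumed to have a deliberate bug (it lowercases every value) so that the two assertion strategies can be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy model, not Iceberg code: a "file" is an in-memory list of strings.
final class RoundTripMaskingSketch {

  // Stand-in for the generic Record writer (assumed correct).
  static List<String> genericWrite(List<String> records) {
    return new ArrayList<>(records);
  }

  // Stand-in for a RowData reader with a deliberate bug: it lowercases every value.
  static List<String> buggyRowRead(List<String> file) {
    List<String> rows = new ArrayList<>();
    for (String value : file) {
      rows.add(value.toLowerCase(Locale.ROOT));
    }
    return rows;
  }

  // Stand-in for the RowData writer (assumed correct here).
  static List<String> rowWrite(List<String> rows) {
    return new ArrayList<>(rows);
  }

  // Old strategy: compare the rows from step 2 with the rows read back in step 4.
  static boolean roundTripCheck(List<String> records) {
    List<String> step2 = buggyRowRead(genericWrite(records));
    List<String> step4 = buggyRowRead(rowWrite(step2));
    return step2.equals(step4);
  }

  // New strategy: compare the rows that were read with the generated records.
  static boolean sourceCheck(List<String> records) {
    List<String> rows = buggyRowRead(genericWrite(records));
    return rows.equals(records);
  }

  public static void main(String[] args) {
    List<String> records = List.of("Alpha", "beta");
    System.out.println("round-trip check passes: " + roundTripCheck(records));
    System.out.println("source check passes: " + sourceCheck(records));
  }
}
```

In this sketch the round-trip check passes even though the reader is wrong, because the same bug is applied on both sides of the comparison, while the check against the generated records fails and exposes it.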
   
   BTW, this patch also cleans up several things in the unit tests under the flink module:
   1.   Since the Parquet reader has been rewritten on top of `RowData`, nothing depends on `Row` anymore, so we can remove `RandomData.java`.
   2.   All three file-format reader/writer unit tests extend `org.apache.iceberg.data.DataTest`, which already defines `public TemporaryFolder temp = new TemporaryFolder()`, so the subclasses no longer need to define `temp`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



