wypoon opened a new pull request, #15765:
URL: https://github.com/apache/iceberg/pull/15765

   `TestRewriteDataFilesAction` has helper `createTable` and 
`createTablePartitioned` methods that create a `Table` instance, write to the 
table, and return the `Table` instance. However, as the write is done using the 
`DataFrame` API and not through the `Table`, the `Table` instance that is 
returned is stale. This is bad design and a trap for unwary new users of the 
methods who want to add tests. The trap is made worse by the helper methods 
`shouldHaveFiles` and `shouldHaveSnapshots` (which take the `Table` as an 
argument) doing a `Table::refresh` before performing its work, so if one looks 
at some existing test method that calls `createTable` followed by 
`shouldHaveFiles` as a model, one does not see the hidden call to 
`Table::refresh`.
   The `Table` should be refreshed before it is returned, so it has all the 
changes. Once this is done, some new `Table::refresh` calls need to be added. 
However, I found many unnecessary `Table::refresh` calls (probably added 
defensively due to being once bitten, twice shy) and have removed them.
   I also found and removed unnecessary `Table::refresh` calls in 
`TestRewritePositionDeleteFilesAction`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to