MaxNevermind commented on PR #1335:
URL: https://github.com/apache/parquet-java/pull/1335#issuecomment-2298027081

   @wgtmac 
   
   I started working on the tests, but I can't figure out the current approach 
to ParquetRewriter testing from the existing tests. The full list of 
features I see being tested:
   - data validity after merging
   - single / multiple files merging
   - column nullification
   - column pruning
   - column encryption
   - codec preservation
   - bloom filter preservation
   - page index verification
   - metadata (CREATED_BY_KEY) preservation
   
   I'm used to an approach where features are unit tested independently, 
one at a time. But looking at the existing ParquetRewriter tests, I can see that 
some of them check multiple things in the same test, and I can't figure 
out the system behind how features are combined into a single test.
   
   So how should I approach it? Should I aim to cover all the features 
in one or two big tests, or write multiple tests and make sure each feature 
is covered by at least one of them?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

