rdblue commented on pull request #1531:
URL: https://github.com/apache/iceberg/pull/1531#issuecomment-702936438


   Thanks for working on this, @mickjermsurawong-stripe!
   
   I like dynamically loading a location provider, and I like that the changes 
required are pretty small. That looks good to me.
   
   What I'd like to change is how this is tested. There are two main issues 
with it. First, the tests are in an unrelated suite for Spark that is mainly 
testing that Spark works with a variety of different schemas. It doesn't seem 
appropriate to put the tests for injecting a location provider in a different 
module and into this suite. Second, the `TestLocationProvider` returns real 
paths and the tests work by writing data files and validating that files are 
underneath those paths. That does test that it works, but I think this makes 
validation much harder. This approach requires a lot of test code and runs a 
lot of unrelated cases through to validate the location provider.
   
   I would prefer a more targeted test in a new suited in core, 
`TestLocationProvider`. That would have a case that injects a 
`TestLocationProvider` that keeps track of calls in static variables, so you'd 
be able to inject the provider, get it from 
`table.operations().locationProvider()`, call some methods, and validate that 
`TestLocationProvider` received the calls.
   
   We also need negative tests for when the provider can't be found. I'd like 
to have a good error message like "Cannot find location provider class: %s".


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to