rdblue commented on pull request #1478:
URL: https://github.com/apache/iceberg/pull/1478#issuecomment-698033854


   Thanks for working on this, @marton-bod! And sorry for the delay in replying 
on this. I was initially focused on trying to avoid the problem of needing both 
hive 2 and hive 3 modules, but I don't see a way around it because the OI 
interfaces now specify incompatible objects. So I agree that we will need 
additional modules to handle this.
   
   But, I think there are some ways to simplify the changes this introduces. 
Because the changes needed between 2 and 3 are minor, I think the goal should 
be to produce a single iceberg-hive-runtime Jar that works in both versions. To 
do that, we need to build a `DateObjectInspector` for Hive 2 and one for Hive 
3, but return the correct one at runtime using reflection. That way, we detect 
whether to use Hive 2 or 3 (e.g., by checking if `DateWritableV2` exists) and 
return an OI that matches. The other class would not be loaded, so it would not 
cause any issues.
   
   I think that we can achieve this using just one new module, iceberg-hive3, 
that adds the new object inspectors. The other module could continue to depend 
on Hive 2.
   
   I'd like to avoid selecting `versions.props` based on flags, so we would 
ideally just embed the Hive 3 version in the iceberg-hive3 dependency. That 
module should also depend on the iceberg-mr module so that it can run the same 
tests (maybe it can set the test source directory to share with hive2?). This 
module would have both hive 2 and hive 3 classes in its test classpath, so it 
would validate that having both doesn't break Hive. And, since Hadoop it always 
pulled in as a test dependency, it can use Hadoop 3.
   
   Finally, the iceberg-hive-runtime module would pull in both iceberg-hive and 
iceberg-hive3 so that all of the classes are in our runtime module.
   
   I think this would greatly simplify the support:
   1. It would add only one new module
   2. It would avoid using build flags
   
   What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to