[
https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027546#comment-16027546
]
Vihang Karajgaonkar commented on HIVE-16771:
--------------------------------------------
The test failure {{udtf_replicate_rows}} is unrelated; the test passes when I
run it locally. I ran it twice and it succeeded both times. I am attaching
one more version with the changes described below, and I am hoping that the
next run will succeed for that test.
The new version of the patch closes the connection object in the
getMetastoreSchemaVersion method implementation.
Hi [~ngangam] I agree that the interface method should ideally just look like
{{getMetaStoreSchemaVersion()}}. I looked into that possibility, but it seems
that achieving it may need a major refactoring. I think in general
HiveSchemaTool can be made a lot more generic, which would enable such a
seamless plug-and-play design. In order to do that I propose the following
enhancements to it.
1. I think HiveSchemaTool is in the BeeLine module currently only because it
uses BeeLine to run the queries on the metastore. Ideally I think it makes
sense to move HiveSchemaTool to the metastore module in the package
{{org.apache.hadoop.hive.metastore.tools}}. How it runs the queries should be
left to the implementations of the interface. If we move it to the metastore
package we can potentially just use JDOQL and datanucleus to query the database
like what MetaTool does.
2. In order to do the above we need to make it generic enough that any
database client can be plugged into it to retrieve the results. So
it should only interact with these implementations through an interface
(IMetaStoreSchemaInfo) which should also be in the same package as
HiveSchemaTool.
3. The implementations of the interface however could be user-defined. In case
of Hive we already have the default implementation using BeeLine, which we
could keep in the BeeLine module.
4. Once we do all the above, I think both the interface and the design will
look a lot cleaner.
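To make the proposal above more concrete, here is a rough sketch of the
plug-and-play shape I have in mind. All names below ({{SchemaInfoSketch}},
{{FixedSchemaInfo}}, {{SchemaToolSketch}}, {{getMetaStoreScriptDir}}) are
hypothetical and only for illustration; they are not the existing Hive classes
or the signatures in the attached patch. The fixed-value implementation stands
in for a real one that would query the database through BeeLine, JDOQL or any
other client.

```java
import java.util.Objects;

/** Hypothetical stand-in for the proposed interface; not the actual Hive API. */
interface SchemaInfoSketch {
    /** Schema version recorded in the metastore database. */
    String getMetaStoreSchemaVersion();

    /** Directory holding the schema init/upgrade scripts. */
    String getMetaStoreScriptDir();
}

/** Illustrative implementation with fixed values; a real one would query the database. */
final class FixedSchemaInfo implements SchemaInfoSketch {
    private final String version;
    private final String scriptDir;

    FixedSchemaInfo(String version, String scriptDir) {
        this.version = Objects.requireNonNull(version);
        this.scriptDir = Objects.requireNonNull(scriptDir);
    }

    @Override
    public String getMetaStoreSchemaVersion() {
        return version;
    }

    @Override
    public String getMetaStoreScriptDir() {
        return scriptDir;
    }
}

/** Hypothetical tool that only talks to the interface, never to a specific client. */
final class SchemaToolSketch {
    private final SchemaInfoSketch schemaInfo;

    SchemaToolSketch(SchemaInfoSketch schemaInfo) {
        this.schemaInfo = Objects.requireNonNull(schemaInfo);
    }

    /** True when the version reported by the plugged-in implementation equals the expected one. */
    boolean matchesExpectedVersion(String expected) {
        return schemaInfo.getMetaStoreSchemaVersion().equals(expected);
    }

    /** Summary built only from what the interface exposes. */
    String describe() {
        return "version=" + schemaInfo.getMetaStoreSchemaVersion()
            + ", scripts=" + schemaInfo.getMetaStoreScriptDir();
    }
}

final class SchemaToolDemo {
    public static void main(String[] args) {
        // The version string and path are made-up example values.
        SchemaToolSketch tool =
            new SchemaToolSketch(new FixedSchemaInfo("2.3.0", "/tmp/scripts"));
        System.out.println(tool.describe());
        System.out.println(tool.matchesExpectedVersion("2.3.0"));
    }
}
```

With this shape the tool never needs to know how the version was obtained, so
swapping the BeeLine-based implementation for a user-defined one would only
change what is passed to the constructor.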
What do you think about these proposals? We can take it up in a separate JIRA
if you think these make sense.
For now, I think the attached patch is reasonably generic given that there are
a lot of cross dependencies between HiveSchemaTool, BeeLine and the
metastore. Can you please review and let me know what you think? Thanks!
> Schematool should use MetastoreSchemaInfo to get the metastore schema version
> from database
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-16771
> URL: https://issues.apache.org/jira/browse/HIVE-16771
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Minor
> Attachments: HIVE-16771.01.patch, HIVE-16771.02.patch,
> HIVE-16771.03.patch
>
>
> HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo
> implementation to manage schema upgrades and initialization if needed. In
> order to make HiveSchemaTool completely agnostic it should depend on
> IMetastoreSchemaInfo implementation which is configured to get the metastore
> schema version information from the database. It should also not assume the
> scripts directory or hardcode it; instead it should ask the
> MetastoreSchemaInfo class for the metastore scripts directory.