HiveStorageHandler.configureTableJobProperties() should let the handler know 
whether it is configuring for input or output
---------------------------------------------------------------------------------------------------------------------------

                 Key: HIVE-2773
                 URL: https://issues.apache.org/jira/browse/HIVE-2773
             Project: Hive
          Issue Type: Improvement
            Reporter: Francis Liu


HiveStorageHandler.configureTableJobProperties() is called to allow the storage 
handler to set up any properties that the underlying 
inputformat/outputformat/serde may need. But the handler implementation does 
not know whether it is being called to configure input or output. This is a 
problem for handlers that set external state. In the case of HCatalog's HBase 
storageHandler, whenever a write needs to be configured we create a write 
transaction, which needs to be committed or aborted later on. Configuring for 
both input and output each time configureTableJobProperties() is called is 
therefore undesirable, because a call made only for input would still create a 
write transaction. This has become an issue since HCatalog is dropping 
storageDrivers in favor of SerDe and StorageHandler (see HCATALOG-237).

My proposal is to replace configureTableJobProperties() with two methods:

configureInputJobProperties()
configureOutputJobProperties()

Each method will have the same signature. I took a cursory look at the code and 
I believe the changes should be straightforward, given that we are not really 
changing any behavior, just splitting responsibility. If the community is fine 
with this approach I will go ahead and create a patch.
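
As a rough illustration of the split proposed above, here is a minimal Java 
sketch. It is not the actual Hive interface or a patch: TableDescStub, 
SplitStorageHandler, TransactionalHandler, the "example.*" property keys and 
the (table descriptor, job properties map) parameter list are all assumptions 
made so the example compiles on its own. The point it tries to show is that a 
handler with external state can open its write transaction only on the output 
path.

```java
import java.util.HashMap;
import java.util.Map;

// Small driver showing that configuring input has no write-side effect.
class SplitHandlerDemo {
    public static void main(String[] args) {
        TransactionalHandler handler = new TransactionalHandler();
        Map<String, String> readProps = new HashMap<>();
        Map<String, String> writeProps = new HashMap<>();

        handler.configureInputJobProperties(new TableDescStub("events"), readProps);
        System.out.println("after input:  " + handler.openWriteTransactions);

        handler.configureOutputJobProperties(new TableDescStub("events"), writeProps);
        System.out.println("after output: " + handler.openWriteTransactions);
    }
}

// Hypothetical stand-in for Hive's table descriptor type.
class TableDescStub {
    final String tableName;

    TableDescStub(String tableName) {
        this.tableName = tableName;
    }
}

// Assumed shape of the two proposed methods; both share one signature.
interface SplitStorageHandler {
    void configureInputJobProperties(TableDescStub tableDesc, Map<String, String> jobProperties);

    void configureOutputJobProperties(TableDescStub tableDesc, Map<String, String> jobProperties);
}

// Hypothetical handler that touches external state only when configuring output.
class TransactionalHandler implements SplitStorageHandler {
    // Counts write transactions opened so far (stands in for real external state).
    int openWriteTransactions = 0;

    @Override
    public void configureInputJobProperties(TableDescStub tableDesc, Map<String, String> jobProperties) {
        // Read path: only pass properties through, no transaction is created.
        jobProperties.put("example.input.table", tableDesc.tableName);
    }

    @Override
    public void configureOutputJobProperties(TableDescStub tableDesc, Map<String, String> jobProperties) {
        // Write path: open a transaction that must be committed or aborted later.
        openWriteTransactions++;
        jobProperties.put("example.output.table", tableDesc.tableName);
        jobProperties.put("example.write.txn.id", String.valueOf(openWriteTransactions));
    }
}
```

With a single configureTableJobProperties() method, the handler in this sketch 
would have to increment the counter on every call, including calls made only 
to set up a read.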


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
