[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765518#comment-13765518 ] Viraj Bhat commented on HIVE-4331: -- Thanks Ashutosh Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.13.0 Attachments: HIVE4331_07-17.patch, HIVE-4331.1.patch, HIVE-4331.2.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765054#comment-13765054 ] Hive QA commented on HIVE-4331: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12602669/HIVE-4331.1.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/701/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/701/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-701/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1522097. At revision 1522097. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, HIVE-4331.1.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13765126#comment-13765126 ] Viraj Bhat commented on HIVE-4331: -- New attachment should get rid of the above error message from the PRE-COMMIT build. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, HIVE-4331.1.patch, HIVE-4331.2.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763368#comment-13763368 ] Sushanth Sowmyan commented on HIVE-4331: Yup. I do have a primary motivation in getting this patch into 0.12 because we've already gone ahead and deprecated the HBaseHCatStorageHandler under the assumption that we'll make this in. I will work with Viraj to make sure this makes it in. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763360#comment-13763360 ] Olga Natkovich commented on HIVE-4331: -- Hi Sushanth, Thanks for your feedback. Can we agree to commit the patches as is with the understanding that we will address the remaining concerns after wards. The patches are large and we already spend a lot of time adjusting andre-basing them. Given that things change pretty rapidly, we just want to make sure we do not need to rebase yet again. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763341#comment-13763341 ] Sushanth Sowmyan commented on HIVE-4331: Viraj, I'm okay with the hive-side. The HCatalog side, I do have some comments to go over after the recent refactor. For now, I'm going to go ahead and make this bug easier to commit by splitting it up into two sub-tasks. That way, we can at least get a commit of the PTOF without further delay. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763629#comment-13763629 ] Sushanth Sowmyan commented on HIVE-4331: As an update, after a bit of thinking about how backward compatibility should, I have no further issues or comments on the HCatalog side of the patch, I'm +1 on it. There is work to be done there, but not relevant directly to this patch. It belongs more into an addendum for HIVE-5233 than it does over here. I'm +1 on the HCat side (barring some minor spacing issues, which I can fix before committing) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13761654#comment-13761654 ] Viraj Bhat commented on HIVE-4331: -- Hi Sushanth, Thanks for your offer for help. I went over the trunk code of HCat and modified the HCat's storage handler so that they reflect in the org.apache.hcatalog set. There are something else I observed in the trunk, such as: 1) Most of the pom files have 0.12.0-SNAPSHOT, so I changed it to reflect 0.13.0-SNAPSHOT 2) Had to modify the HCatStorageHandler.java in the org.apache.hcatalog.mapreduce package as it had an annotation: * @deprecated Use/modify {@link org.apache.hive.hcatalog.mapreduce.HCatStorageHandler} instead to get past the compile as we do not have this class in the org.apache.hive.hcatalog Let me post the new patch and wait for the pre-commit build status. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13761786#comment-13761786 ] Hive QA commented on HIVE-4331: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12602102/HIVE-4331.patch {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 3098 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteRead org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteRead org.apache.hive.hcatalog.mapreduce.TestHCatExternalHCatNonPartitioned.testHCatNonPartitionedTable org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteReadMR org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1 org.apache.hive.hcatalog.pig.TestHCatStorerMulti.testStoreBasicTable org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteReadMR org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_smb_mapjoin_8 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/669/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/669/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762281#comment-13762281 ] Brock Noland commented on HIVE-4331: It's because the HCatalog tests are flaky. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762272#comment-13762272 ] Viraj Bhat commented on HIVE-4331: -- Hi Ashutosh and Sushanth, All unit tests corresponding to the Hive's HBaseCliDriver and the (new added tests and old ones) storage handler of HCat pass. I am not sure why the PRE-COMMIT build is showing the following 7 failure/skipped tests for HCat. In fact on my desktop I get no failures for HCat. Most of them are originating in the TestSequenceFileReadWrite TestSequenceFileReadWrite.testSequenceTableWriteRead - test case passes TestSequenceFileReadWrite.testTextTableWriteRead - test cases passes TestSequenceFileReadWrite.testTextTableWriteReadMR - test cases passes TestSequenceFileReadWrite.testSequenceTableWriteReadMR - test case passes TestHCatExternalHCatNonPartitioned.testHCatNonPartitionedTable - test cases passes on my desktop TestHCatStorerMulti.testStoreBasicTable - test case passes TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable - test case passes on my desktop On the Hive there are 2 unit test failures, TestMinimrCliDriver.testCliDriver_smb_mapjoin_8 and TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1 which might not be related to this patch. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762720#comment-13762720 ] Viraj Bhat commented on HIVE-4331: -- Hi Brock, thanks, I was looking for a blue build, seems that it will not happen. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762723#comment-13762723 ] Viraj Bhat commented on HIVE-4331: -- Ashutosh and Sushanth, I would really appreciate if you we could move forward with this patch and commit it to trunk and branch 0.12 Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760941#comment-13760941 ] Viraj Bhat commented on HIVE-4331: -- Sushanth, it seems that: HIVE-5233 and HIVE-5236 were applied around a few hours back I created the patch. So it might be possible the patch did not apply. So I will try applying these patches on my trunk and try recreating again. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760969#comment-13760969 ] Viraj Bhat commented on HIVE-4331: -- Sushanth I see that there are 2 directories under hcatalog trunk: https://github.com/apache/hive/tree/trunk/hcatalog/core/src/main/java/org/apache/hive/hcatalog https://github.com/apache/hive/tree/trunk/hcatalog/core/src/main/java/org/apache/hcatalog Should my patch apply to both these directories? Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760754#comment-13760754 ] Viraj Bhat commented on HIVE-4331: -- Sushanth and Ashutosh updated the patch to reflect the comments. Also attaching a copy of the integrated patch on the Jira. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760778#comment-13760778 ] Viraj Bhat commented on HIVE-4331: -- Attaching a unified patch as HIVE-JIRA.patch for the precommit build to run. Regards Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760887#comment-13760887 ] Hive QA commented on HIVE-4331: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12601928/HIVE-4331.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/652/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/652/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-652/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update Ubuild.properties Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1520721. Updated to revision 1520721. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Fix For: 0.12.0 Attachments: HIVE4331_07-17.patch, hive4331hcatrebase.patch, HIVE-4331.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13759782#comment-13759782 ] Sushanth Sowmyan commented on HIVE-4331: [~viraj] : The HIVE-4869 freeze is over, and you can generate the patch for the hcatalog side from trunk now. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13759937#comment-13759937 ] Viraj Bhat commented on HIVE-4331: -- Sushanth, thanks for informing. I am now regenerating the patch and running the unit tests. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13758575#comment-13758575 ] Viraj Bhat commented on HIVE-4331: -- Sushanth, most of the issues stated in phabricator have been addressed in the patch I will reattach if HIVE-4869 goes through. Meanwhile I will wait for your go as to when HIVE-4869 gets committed. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13757352#comment-13757352 ] Sushanth Sowmyan commented on HIVE-4331: One quick note - the HCatalog directory will be frozen tonight, to implement the changes with HIVE-4869. Since this patch touches every single file in hcatalog (package name change), you will have to regenerate the hcatalog patch after that. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HBase Handler, HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751697#comment-13751697 ] Sushanth Sowmyan commented on HIVE-4331: I've gone through the HCat section of the patch as well, and I'm okay with it. I give that my +1 as well, pending the commit of the hive side of the patch, and making sure that checkstyle passes the changes (My guess, right now, is that with word-wrapping the Apache license headers, it won't) -- The only changes needed, on my end, are on the hive side, and are minor code-style and implementation issues, such as using raw strings, and using a map where there wasn't a need for one. If those are addressed, this patch is good to go. (Also, the whole patch will need a rebase to be applicable to trunk right now) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751700#comment-13751700 ] Viraj Bhat commented on HIVE-4331: -- Sushanth thanks for going through the HCat patch. I am currently implementing the changes in Hive and rebasing it. I will post it as soon as I am done. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750364#comment-13750364 ] Francis Liu commented on HIVE-4331: --- {quote} Francis, you're welcome to go ahead and review it as well. I have been looking at it as well, along with testing, although I waited on commenting till I'd finished with the hive-side review. I'll comment on both of them before Monday. {quote} Cool. It'd be good to have a reviewer that looks at both pieces. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750367#comment-13750367 ] Olga Natkovich commented on HIVE-4331: -- Hi Sushanth, Were you able to review this patch? Thanks! Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750371#comment-13750371 ] Sushanth Sowmyan commented on HIVE-4331: I left some more comments on the hive patch phabricator review. Apart from a couple of minor code-style comments, the majority of my feeling about this is that I'm okay with what HivePTOF is doing, and it's a good first step, and I will +1 it, but I think that it does not go far enough along to address being able to use any generic MR OutputFormat with hive. That said, I agree that that would not really be possible unless we plumb out how HiveOutputFormat is used itself, and I'm not completely certain people have a need for that. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750396#comment-13750396 ] Sushanth Sowmyan commented on HIVE-4331: I would love to see SH become a first class entry in hive, and HOF be a kind of SH, leading to its eventual removal. That's precisely what my long-term goal for this is. and I'm not completely certain people have a need for that By this bit, I meant that I wasn't sure people had a need for doing away with HOF (other than for code-cleanliness, which is why I would like to see it gone) being able to use any generic OF with hive - most of the usecases of traditional M/R OFs are already covered by hive, or for newer formats being developed, the OF writer winds up making changes so that it is hive compatible, such as with orc, or with the HBase SH. So unless there were a major push to see a BlahOutputFormat that is widely used, but was not already usable from within Hive, I don't see there being a necessity case for a change in hive that I want. :) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750501#comment-13750501 ] Sushanth Sowmyan commented on HIVE-4331: Oh, and one more thing: {quote} With this StorageHandlers can use generic OFs. {quote} This assertion is incorrect. A more precise assertion would be that with this patch, Hive can use generic OFs that do not do anything useful or necessary in their outputcommitters(i.e. do not need calls on them to be made). If an OF is designed with an outputcommitter in mind, chances are that it will need some retrofitting before it will work from within hive. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750510#comment-13750510 ] Sushanth Sowmyan commented on HIVE-4331: As it currently stands, SHes *need* , as a prerequisite, to be able to work at a partition level, and have outputcommitter concepts before it can be used more widely. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750508#comment-13750508 ] Francis Liu commented on HIVE-4331: --- {quote} most of the usecases of traditional M/R OFs are already covered by hive, or for newer formats being developed, the OF writer winds up making changes so that it is hive compatible, such as with orc, or with the HBase SH {quote} Yes but ideally they don't really need HOF to do that. {quote} So unless there were a major push to see a BlahOutputFormat that is widely used, but was not already usable from within Hive, I don't see there being a necessity case for a change in hive that I want. {quote} Yep, which is why we want to do it incrementally. Letting it leak into SH and hcat code would make the idea of cleaning things up less appealing. I think if we just started using SH for new OFs and not use HOF, these pieces would naturally go into this state. Having said that it'd be nice if Orc could be moved to using storage handlers. It would also help SH mature. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750523#comment-13750523 ] Ashutosh Chauhan commented on HIVE-4331: [~toffer] One more thing popped up in my discussion with Sushanth. Another issue which is in this general area of Hive usage of OF is w.r.t OutputCommitter. Hive currently explicitly disables OC and performs lot of logic (which folks usually write in OC) from client side. Architecturally, its cleaner for Hive to migrate these things from client to OC and make OC first class citizen. For folks who has OF which has OC, it will be easier to integrate that in Hive, instead of understanding Hive innards and handling of OC. Wondering if you have given a thought on this? I just want to make sure if and when we go that route these current changes won't get in the way. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750526#comment-13750526 ] Francis Liu commented on HIVE-4331: --- Sigh there's the rub. Are there Jira's for this? Would be great to keep track of this in case someone would like to do it. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750545#comment-13750545 ] Francis Liu commented on HIVE-4331: --- {quote} For folks who has OF which has OC, it will be easier to integrate that in Hive, instead of understanding Hive innards and handling of OC. Wondering if you have given a thought on this? I just want to make sure if and when we go that route these current changes won't get in the way. {quote} For HCat we already do it this way. It's not really just the OC but the OF,OC,RR in general. HOF essentially is doing the Hive specific stuff that the plain OC, RR, etc can do as well. So I don't think we changed the complexity of the work needed to support new formats? Is that what you meant by get in the way? In the long run it'd be better since HCat and Hive treat OFs the same way. Though it'd be great to document what that contract (beyond the typical OF) is. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750579#comment-13750579 ] Ashutosh Chauhan commented on HIVE-4331: No HCat and Hive dont treat OFs in same way. This difference of OF handling is a reason why HCatOF couldn't be used from Hive, another being HCat uses mapreduce api while Hive uses mapred api. If we can make Hive use HCatOF that will be a win, but thats yet another topic. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750598#comment-13750598 ] Francis Liu commented on HIVE-4331: --- {quote} No HCat and Hive dont treat OFs in same way. This difference of OF handling is a reason why HCatOF couldn't be used from Hive, another being HCat uses mapreduce api while Hive uses mapred api. If we can make Hive use HCatOF that will be a win, but thats yet another topic. {quote} Currently they don't mainly because of HOF but they behave in almost the same way else this whole interoperability story is broken. With this patch they'll at least be closer when it comes to dealing with OFs that don't use HOF. Instead of having to mirror that behavior. Actually AFAIK only the HCatOF wrapper classes uses mapreduce and the underlying stuff deals with mapred which we did as part of the StorageDriver-SerDe migration. So it'd be relatively easy to support a mapred version of HCatOF. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750620#comment-13750620 ] Sushanth Sowmyan commented on HIVE-4331: [~ashutoshc] : The mapreduce-mapred transition is not a tough one by itself(apart from losing abort/commit semantics, which need to be emulated). The Outputcommitter concept is quite a bit more involved though. For now, I'm still +1 on this patch(the hive side) because it moves us one step closer to that goal. I don't think we should require the OC concept being brought in as well for it to be committed, I brought that up because I wanted to highlight that we aren't done with this patch, and more work is required past it, and this patch alone does not make generic OFs workable from within Hive. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749094#comment-13749094 ] Francis Liu commented on HIVE-4331: --- It seems no one has reviewed the HCat patch yet. I can go ahead and do the review if everyone is ok with it given that I worked on the design and reviewed the code internally? Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749101#comment-13749101 ] Viraj Bhat commented on HIVE-4331: -- Hi Francis, Thanks for your help. Regards Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749108#comment-13749108 ] Sushanth Sowmyan commented on HIVE-4331: Francis, you're welcome to go ahead and review it as well. I have been looking at it as well, along with testing, although I waited on commenting till I'd finished with the hive-side review. I'll comment on both of them before Monday. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749118#comment-13749118 ] Viraj Bhat commented on HIVE-4331: -- Sushanth, thanks for your help. I will finish modifying the Hive patch till then. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747686#comment-13747686 ] Ashutosh Chauhan commented on HIVE-4331: I see what you are saying and I agree HiveOF interface is burdensome. What I am wondering is if we can morph HiveOFImpl into something which can serve this requirement, so that we don't have plethora of classes trying to wrap OFs in Hive at various point in lifecycle. How about we make HiveOFImpl implement HiveOF and than add implementation for getHiveRR and getRR in it in similar fashion in whats in proposed HivePTOF is? Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747756#comment-13747756 ] Francis Liu commented on HIVE-4331: --- {quote} How about we make HiveOFImpl implement HiveOF and than add implementation for getHiveRR and getRR in it in similar fashion in whats in proposed HivePTOF is? {quote} I don't think that's possible since HiveOFImpl is not used the same way as HiveOF is. HiveOFImpl is the actual OF that's passed to the job and is essentially a wrapper for the different concrete FileSinkOperators that's part of a given stage. The FileSinkOperators in turn contain a concrete instance of HiveOF. Also it would be meaningless to have HiveOFImpl implement HiveOF given that the consumer of HiveOFImpl is the MR framework which has no notion of HiveOF. Maybe I'm missing something here? I'd be happy to discuss this with you offline to help this patch get moving. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747964#comment-13747964 ] Ashutosh Chauhan commented on HIVE-4331: [~toffer] I see what you are saying. I have also asked [~sushanth] to review Hive pieces to get an alternate opinion. [~viraj] I left some implementation comments on phabricator. Assuming [~sushanth] is OK with patch, those are all the comments I have. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747976#comment-13747976 ] Viraj Bhat commented on HIVE-4331: -- Hi Ashutosh, Thanks for all your help. Let me start addressing the comments in Hive. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745136#comment-13745136 ] Francis Liu commented on HIVE-4331: --- Ashutosh, I don't think what you're suggesting will work. Since HiveOFImpl is merely a container for initializing real OFs that are contained in defined FileSinkOps. Looking at HIVE-2764, HiveOFImpl was introduced to guarantee checkOutputSpecs was called on each real OF so that OFs such as HBase has a chance of adding a delegation token. It seems the HiveOFImpl name is misleading since it doesn't extend HiveOF and is not used in the same way. Let me know if I missed anything. The requirement that a StorageHandler's OF has to extend HiveOF pervasive all over the hive code. ie A lot of classes have HiveOutputFormat (tableDesc, FileSinkOP, etc) as a member variable instead of OF. So it's not possible to just change this expectation at the end point of the code path. With storage handlers I don't see why we still need HiveOF. We should migrate all supported OFs to use storage handlers and remove HiveOF. HivePassThru is an step in that direction. StorageHandlers will no longer have to extend HiveOF and HiveOF does not have to leak into HCat code to get interoperability working. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13744012#comment-13744012 ] Ashutosh Chauhan commented on HIVE-4331: HIVE-2764 introduced HiveOutputFormatImpl for similar requirements for HBaseStorgagehandler. I wonder if we can reuse that here instead of introducing new HivePassthroughFormat. Did you consider reusing that class (and/or enhancing it) to meet new requirements. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732780#comment-13732780 ] Viraj Bhat commented on HIVE-4331: -- Hi Ashutosh, I have created 2 review requests one which changes files in the HCatalog contrib and the other in Hive. Hope this helps in the review process. Hive: https://reviews.facebook.net/D12063 HCatalog: https://reviews.facebook.net/D12069 Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722984#comment-13722984 ] Ashutosh Chauhan commented on HIVE-4331: [~viraj] If its not too hard can you separate current patch into two issues : one dealing with HivePassThroughFormat and second about merging storage handlers. Seems like first is a pre-requisite for second. I want to understand that change little better since that may have implications for other storage handler writers and output format writers for Hive. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723023#comment-13723023 ] Ashutosh Chauhan commented on HIVE-4331: If I am reading this right, is the motivation of HivePassThroughFormat is that storage handler writers need not to write their custom OF (to implement Hive OF) anymore and thus can use their existing OF unmodified and all the necessary plumbing will be done in Storage handler ? Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723029#comment-13723029 ] Viraj Bhat commented on HIVE-4331: -- Hi Ashutosh, That is right,. They do not need to write their CustomOF to implement HiveOF. If it makes it easier to review, I can split the patch based on HCat (contrib) and core Hive. Viraj Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723055#comment-13723055 ] Ashutosh Chauhan commented on HIVE-4331: Yeah.. it will make it easier for review. Division in terms of core Hive and HCatalog sounds good. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: HIVE4331_07-17.patch, StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4331) Integrated StorageHandler for Hive and HCat using the HiveStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13706460#comment-13706460 ] Rajesh Balamohan commented on HIVE-4331: This will be extremely beneficial for lots of usecases involving Hive, HBase, HCatalog and Pig. Especially one can think of hosting frequently changed data in HBase and access it in Hive/Pig/MapReduce via HCatalog. Integrated StorageHandler for Hive and HCat using the HiveStorageHandler Key: HIVE-4331 URL: https://issues.apache.org/jira/browse/HIVE-4331 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Viraj Bhat Attachments: StorageHandlerDesign_HIVE4331.pdf 1) Deprecate the HCatHBaseStorageHandler and RevisionManager from HCatalog. These will now continue to function but internally they will use the DefaultStorageHandler from Hive. They will be removed in future release of Hive. 2) Design a HivePassThroughFormat so that any new StorageHandler in Hive will bypass the HiveOutputFormat. We will use this class in Hive's HBaseStorageHandler instead of the HiveHBaseTableOutputFormat. 3) Write new unit tests in the HCat's storagehandler so that systems such as Pig and Map Reduce can use the Hive's HBaseStorageHandler instead of the HCatHBaseStorageHandler. 4) Make sure all the old and new unit tests pass without backward compatibility (except known issues as described in the Design Document). 5) Replace all instances of the HCat source code, which point to HCatStorageHandler to use theHiveStorageHandler including the FosterStorageHandler. I have attached the design document for the same and will attach a patch to this Jira. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira