[jira] [Commented] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729135#comment-13729135 ]

Hive QA commented on HIVE-4870:
-------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12595856/HIVE-4870.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2759 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/303/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/303/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

Explain Extended to show partition info for Fetch Task
------------------------------------------------------

Key: HIVE-4870
URL: https://issues.apache.org/jira/browse/HIVE-4870
Project: Hive
Issue Type: Bug
Components: Query Processor, Tests
Affects Versions: 0.11.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
Fix For: 0.11.1
Attachments: HIVE-4870.patch

Explain extended does not include partition information for the Fetch Task (FetchWork); the Map Reduce Task (MapredWork) already does this. The patch adds Partition Description info to the Fetch Task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4970) BinaryConverter does not respect nulls
[ https://issues.apache.org/jira/browse/HIVE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729370#comment-13729370 ]

Hudson commented on HIVE-4970:
------------------------------

ABORTED: Integrated in Hive-trunk-h0.21 #2245 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2245/])

HIVE-4970 BinaryConverter does not respect null (Mark Wagner via egc)
Submitted by: Mark Wagner
Reviewed by: Edward Capriolo
(ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510263)

* /hive/trunk/ql/src/test/queries/clientpositive/ba_table_udfs.q
* /hive/trunk/ql/src/test/results/clientpositive/ba_table_udfs.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
* /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java

BinaryConverter does not respect nulls
--------------------------------------

Key: HIVE-4970
URL: https://issues.apache.org/jira/browse/HIVE-4970
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Mark Wagner
Assignee: Mark Wagner
Fix For: 0.12.0
Attachments: HIVE-4970.1.patch, HIVE-4970.2.patch

Right now, the BinaryConverter in PrimitiveObjectInspectorConverter does not handle null values the same as the other converters.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
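The bug description above implies the contract the patch restores: a converter should pass null through rather than try to convert it. Below is a minimal, Hive-free sketch of that "null in, null out" pattern; the real class works on ObjectInspectors, and the names here are illustrative only, not the actual PrimitiveObjectInspectorConverter API.

```java
import java.nio.charset.StandardCharsets;

public class BinaryConverterSketch {
    // Illustrative stand-in for a text-to-binary conversion: a null input
    // must yield a null output instead of being converted (or throwing),
    // matching the behavior of the other converters.
    static byte[] convert(String input) {
        if (input == null) {
            return null; // respect nulls
        }
        return input.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(convert(null));          // prints null
        System.out.println(convert("abc").length);  // prints 3
    }
}
```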
[jira] [Commented] (HIVE-4879) Window functions that imply order can only be registered at compile time
[ https://issues.apache.org/jira/browse/HIVE-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729369#comment-13729369 ]

Hudson commented on HIVE-4879:
------------------------------

ABORTED: Integrated in Hive-trunk-h0.21 #2245 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2245/])

Hive-4879 Window function that imply order can only be registered at compile time (Edward Capriolo)
Reviewed by: Brock Noland
(ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510269)

* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFFirstValue.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLastValue.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLead.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java

Window functions that imply order can only be registered at compile time
------------------------------------------------------------------------

Key: HIVE-4879
URL: https://issues.apache.org/jira/browse/HIVE-4879
Project: Hive
Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Fix For: 0.12.0
Attachments: HIVE-4879.1.patch.txt, HIVE-4879.2.patch.txt, HIVE-4879.3.patch.txt, HIVE-4879.4.patch.txt

Adding an annotation for impliesOrder

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Access to trigger jobs on jenkins
Hi,

Are you looking to trigger the pre-commit builds? Unfortunately, to trigger *regular* builds you'd need an Apache username, according to the Apache Infra Jenkins page: http://wiki.apache.org/general/Jenkins

Brock

On Sun, Aug 4, 2013 at 1:37 PM, kulkarni.swar...@gmail.com wrote:
> Hello, I was wondering if it is possible to get access to be able to trigger jobs on the jenkins server? Or is that access limited to committers?
>
> Thanks,
> --
> Swarnim

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-4992) add ability to skip javadoc during build
[ https://issues.apache.org/jira/browse/HIVE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729490#comment-13729490 ]

Brock Noland commented on HIVE-4992:
------------------------------------

Hey this looks good! I think that hcat generates javadoc in a separate build.xml file. Can we add this to the hcat build as well?

add ability to skip javadoc during build
----------------------------------------

Key: HIVE-4992
URL: https://issues.apache.org/jira/browse/HIVE-4992
Project: Hive
Issue Type: Improvement
Components: Build Infrastructure
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
Attachments: HIVE-4992.D11967.1.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
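The usual Ant idiom for making a build step skippable is a property-guarded target. The fragment below is a hedged sketch of that pattern only; the property name `javadoc.skip` and target layout are illustrative assumptions, not the actual names used by the HIVE-4992 patch or Hive's build.xml.

```xml
<!-- Sketch: skip javadoc when -Djavadoc.skip=true is passed on the
     command line. Property and path names here are hypothetical. -->
<target name="javadoc" unless="javadoc.skip">
  <javadoc destdir="${build.dir}/javadoc" sourcepath="${src.dir}"/>
</target>
```

With Ant's `unless` attribute, the target runs normally by default and is skipped entirely when the property is set, e.g. `ant -Djavadoc.skip=true package`.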
Re: Access to trigger jobs on jenkins
Hi Brock,

Yes, I was looking to trigger the pre-commit builds without having to check in a new patch every time to auto-trigger them. I assumed they were similar to the *regular* builds?

On Mon, Aug 5, 2013 at 7:43 AM, Brock Noland br...@cloudera.com wrote:
> Hi,
>
> Are you looking to trigger the pre-commit builds? Unfortunately, to trigger *regular* builds you'd need an Apache username, according to the Apache Infra Jenkins page: http://wiki.apache.org/general/Jenkins
>
> Brock
>
> On Sun, Aug 4, 2013 at 1:37 PM, kulkarni.swar...@gmail.com wrote:
>> Hello, I was wondering if it is possible to get access to be able to trigger jobs on the jenkins server? Or is that access limited to committers?
>>
>> Thanks,
>> --
>> Swarnim
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org

--
Swarnim
Re: Access to trigger jobs on jenkins
Hi,

The precommit builds are similar to regular builds, but I agree it'd be nice to allow people who may not have access to the Apache Jenkins to re-trigger the precommit build. Let's think about that. For now I'd just re-upload the same patch.

Brock

On Mon, Aug 5, 2013 at 8:35 AM, kulkarni.swar...@gmail.com wrote:
> Hi Brock,
>
> Yes, I was looking to trigger the pre-commit builds without having to check in a new patch every time to auto-trigger them. I assumed they were similar to the *regular* builds?
>
> --
> Swarnim

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Created] (HIVE-4997) HCatalog doesn't allow multiple input tables
Daniel Intskirveli created HIVE-4997:
-------------------------------------

Summary: HCatalog doesn't allow multiple input tables
Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4388:
-------------------------------

Attachment: HIVE-4388.patch

TestE2EScenarios was not using a Shim. I have fixed this.

HBase tests fail against Hadoop 2
---------------------------------

Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Brock Noland
Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt

Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1), but fails with HBASE-6396.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Affects Version/s: 0.12.0
Status: Patch Available  (was: Open)

Patch includes a new class, HCatMultipleInputs, which supports multiple table names, database names, partition filters, and mapper classes.

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Status: Open  (was: Patch Available)

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Attachment: HIVE-4997.patch

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4997.patch

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Status: Patch Available  (was: Open)

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4997.patch

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729596#comment-13729596 ]

Hive QA commented on HIVE-4388:
-------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12596124/HIVE-4388.patch

{color:green}SUCCESS:{color} +1 2759 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/304/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/304/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

HBase tests fail against Hadoop 2
---------------------------------

Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Brock Noland
Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt

Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1), but fails with HBASE-6396.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
Problem loading new UDTF in local hive copy
Hi,

I'm trying to compile hive with a new UDTF and have been following the wiki instructions (https://cwiki.apache.org/confluence/display/Hive/GenericUDAFCaseStudy). I've added my new function to the function registry and have successfully updated show_functions.q.out. However, when I recompile and start my local copy of hive with build/dist/bin/hive, the "show functions;" command still does not list my new function. Any thoughts on what I'm missing? Sorry if this is a naive question.

Thanks for your help,
Niko
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Attachment: HIVE-4997.patch1

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4997.patch, HIVE-4997.patch1

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Status: Open  (was: Patch Available)

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4997.patch, HIVE-4997.patch1

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Intskirveli updated HIVE-4997:
-------------------------------------

Status: Patch Available  (was: Open)

HCatalog doesn't allow multiple input tables
--------------------------------------------

Key: HIVE-4997
URL: https://issues.apache.org/jira/browse/HIVE-4997
Project: Hive
Issue Type: Improvement
Components: HCatalog
Affects Versions: 0.12.0
Reporter: Daniel Intskirveli
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4997.patch, HIVE-4997.patch1

HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4388:
-------------------------------

Attachment: HIVE-4388.patch

Simplified one ant condition.

HBase tests fail against Hadoop 2
---------------------------------

Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Brock Noland
Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt

Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1), but fails with HBASE-6396.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4879) Window functions that imply order can only be registered at compile time
[ https://issues.apache.org/jira/browse/HIVE-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729670#comment-13729670 ]

Hudson commented on HIVE-4879:
------------------------------

ABORTED: Integrated in Hive-trunk-hadoop2 #328 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/328/])

Hive-4879 Window function that imply order can only be registered at compile time (Edward Capriolo)
Reviewed by: Brock Noland
(ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510269)

* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionDescription.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCumeDist.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFDenseRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFFirstValue.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLastValue.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLead.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFRank.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java

Window functions that imply order can only be registered at compile time
------------------------------------------------------------------------

Key: HIVE-4879
URL: https://issues.apache.org/jira/browse/HIVE-4879
Project: Hive
Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Fix For: 0.12.0
Attachments: HIVE-4879.1.patch.txt, HIVE-4879.2.patch.txt, HIVE-4879.3.patch.txt, HIVE-4879.4.patch.txt

Adding an annotation for impliesOrder

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4970) BinaryConverter does not respect nulls
[ https://issues.apache.org/jira/browse/HIVE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729671#comment-13729671 ]

Hudson commented on HIVE-4970:
------------------------------

ABORTED: Integrated in Hive-trunk-hadoop2 #328 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/328/])

HIVE-4970 BinaryConverter does not respect null (Mark Wagner via egc)
Submitted by: Mark Wagner
Reviewed by: Edward Capriolo
(ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510263)

* /hive/trunk/ql/src/test/queries/clientpositive/ba_table_udfs.q
* /hive/trunk/ql/src/test/results/clientpositive/ba_table_udfs.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
* /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java

BinaryConverter does not respect nulls
--------------------------------------

Key: HIVE-4970
URL: https://issues.apache.org/jira/browse/HIVE-4970
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Mark Wagner
Assignee: Mark Wagner
Fix For: 0.12.0
Attachments: HIVE-4970.1.patch, HIVE-4970.2.patch

Right now, the BinaryConverter in PrimitiveObjectInspectorConverter does not handle null values the same as the other converters.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4683) fix coverage org.apache.hadoop.hive.cli
[ https://issues.apache.org/jira/browse/HIVE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729669#comment-13729669 ]

Hudson commented on HIVE-4683:
------------------------------

ABORTED: Integrated in Hive-trunk-hadoop2 #328 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/328/])

HIVE-4683 : fix coverage org.apache.hadoop.hive.cli (Aleksey Gorshkov via Ashutosh Chauhan)
(hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510346)

* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java
* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/RCFileCat.java
* /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestCliDriverMethods.java
* /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestCliSessionState.java
* /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestOptionsProcessor.java
* /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestRCFileCat.java

fix coverage org.apache.hadoop.hive.cli
---------------------------------------

Key: HIVE-4683
URL: https://issues.apache.org/jira/browse/HIVE-4683
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
Fix For: 0.12.0
Attachments: HIVE-4683-branch-0.10.patch, HIVE-4683-branch-0.10-v1.patch, HIVE-4683-branch-0.11-v1.patch, HIVE-4683-trunk.patch, HIVE-4683-trunk-v1.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729665#comment-13729665 ]

Edward Capriolo commented on HIVE-4964:
---------------------------------------

With a quick look:

1) You're not using the correct formatting rules for the project. We cannot accept code that does not match the coding conventions.
{code}
+while (pItr.hasNext())
+{
+  Object oRow = pItr.next();
+  forward(oRow, outputObjInspector);
+}
{code}

2) The implementing class should not be on the left side of the equals. Use List, not ArrayList, when possible.
{code}
ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
{code}

Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
-----------------------------------------------------------------------------------------------

Key: HIVE-4964
URL: https://issues.apache.org/jira/browse/HIVE-4964
Project: Hive
Issue Type: Bug
Reporter: Harish Butani
Priority: Minor
Attachments: HIVE-4964.D11985.1.patch

There are still pieces of code that deal with:
- supporting select expressions with Windowing
- supporting a filter with windowing

Need to do this before introducing Perf. improvements.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
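The second review point above ("program to the interface type") looks like this in isolation. This is a minimal standalone sketch of the convention, not code from the HIVE-4964 patch; the names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InterfaceTypeSketch {
    // Declare against the interface type; only the constructor call names
    // the concrete class. A different List implementation (e.g. LinkedList)
    // can then be swapped in later without touching any use sites.
    static List<String> fieldNames() {
        List<String> names = new ArrayList<String>();
        names.add("col1");
        names.add("col2");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fieldNames()); // prints [col1, col2]
    }
}
```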
Re: Tez branch and tez based patches
Which talk are you referencing here? AFAIK all the Hive code we've written is being pushed back into the Tez branch, so you should be able to see it there.

Alan.

On Jul 29, 2013, at 9:02 PM, Edward Capriolo wrote:

At ~25:00: "There is a working prototype of hive which is using tez as the targeted runtime". Can I get a look at that code? Is it on github?

Edward

On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates ga...@hortonworks.com wrote:

Answers to some of your questions inlined.

Alan.

On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote:

There are some points I want to bring up. First, I am on the PMC. Here is something I find relevant: http://www.apache.org/foundation/how-it-works.html

"The role of the PMC from a Foundation perspective is oversight. The main role of the PMC is not code and not coding - but to ensure that all legal issues are addressed, that procedure is followed, and that each and every release is the product of the community as a whole. That is key to our litigation protection mechanisms. Secondly the role of the PMC is to further the long term development and health of the community as a whole, and to ensure that balanced and wide scale peer review and collaboration does happen. Within the ASF we worry about any community which centers around a few individuals who are working virtually uncontested. We believe that this is detrimental to quality, stability, and robustness of both code and long term social structures."

https://blogs.apache.org/comdev/entry/what_makes_apache_projects_different

"All other decisions happen on the dev list, discussions on the private list are kept to a minimum. If it didn't happen on the dev list, it didn't happen - which leads to:
a) Elections of committers and PMC members are published on the dev list once finalized.
b) Out-of-band discussions (IRC etc.) are summarized on the dev list as soon as they have impact on the project, code or community."

- https://issues.apache.org/jira/browse/HIVE-4660, ironically titled "Let their be Tez", has not been +1'ed by any committer. It was never discussed on the dev or the user list (as far as I can tell).

As all JIRA creations and updates are sent to dev@hive, creating a JIRA is de facto posting to the list.

As a PMC member I feel we need more discussion on Tez on the dev list, along with a wiki-fied design document. Topics of discussion should include:

I talked with Gunther and he's working on posting a design doc on the wiki. He has a PDF on the JIRA but he doesn't have write permissions yet on the wiki.

1) What is tez?

In Hadoop 2.0, YARN opens up the ability to have multiple execution frameworks in Hadoop. Hadoop apps are no longer tied to MapReduce as the only execution option. Tez is an effort to build an execution engine that is optimized for relational data processing, such as Hive and Pig. The biggest change here is to move away from only Map and Reduce as processing options and to allow alternate combinations of processing, such as map - reduce - reduce, tasks that take multiple inputs, or shuffles that avoid sorting when it isn't needed. For a good intro to Tez, see Arun's presentation on it at the recent Hadoop Summit (video: http://www.youtube.com/watch?v=9ZLLzlsz7h8, slides: http://www.slideshare.net/Hadoop_Summit/murhty-saha-june26255pmroom212).

2) How is tez different from oozie, http://code.google.com/p/hop/, http://cs.brown.edu/~backman/cmr.html, and other DAG and/or streaming map reduce tools/frameworks? Why should we use this and not those?

Oozie is a completely different thing. Oozie is a workflow engine and a scheduler. Its core competencies are the ability to coordinate workflows of disparate job types (MR, Pig, Hive, etc.) and to schedule them. It is not intended as an execution engine for apps such as Pig and Hive.

I am not familiar with these other engines, but the short answer is that Tez is built to work on YARN, which works well for Hive since it is tied to Hadoop.

3) When can we expect the first tez release?

I don't know, but I hope sometime this fall.

4) How much effort is involved in integrating hive and tez?

Covered in the design doc.

5) Who is ready to commit to this effort?

I'll let people speak for themselves on that one.

6) Can we expect this work to be done in one hive release?

Unlikely. Initial integration will be done in one release, but as Tez is a new project I expect it will be adding features in the future that Hive will want to take advantage of.

In my opinion we should not start any work on this tez-hive integration until these questions are answered to the satisfaction of the hive developers.

Can we change this to "not commit patches"? We can't tell willing people not to work on it.
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Status: Open (was: Patch Available) Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Attachment: (was: HIVE-4870.patch) Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Tez branch and tez based patches
On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote: Also watched http://www.ustream.tv/recorded/36323173 I definitely see the win in being able to stream inter-stage output. I see some cases where small intermediate results can be kept in memory. But I was somewhat under the impression that the map reduce spill settings kept stuff in memory, isn't that what spill settings are? No. MapReduce always writes shuffle data to local disk. And intermediate results between MR jobs are always persisted to HDFS, as there's no other option. When we talk of being able to keep intermediate results in memory we mean getting rid of both of these disk writes/reads when appropriate (meaning not always; there's a trade off between speed and error handling to be made here, see below for more details). There are a few bullet points that came up repeatedly that I do not follow: Something was said to the effect of "Container reuse makes X faster." Hadoop has jvm reuse. Not following what the difference is here? Not everyone has a 10K node cluster. Sharing JVMs across users is inherently insecure (we can't guarantee what code the first user left behind that may interfere with later users). As I understand container re-use in Tez it constrains the re-use to one user for security reasons, but still avoids additional JVM start up costs. But this is a question that the Tez guys could answer better on the Tez lists (d...@tez.incubator.apache.org) "Joins in map reduce are hard" Really? I mean some of them are I guess, but the typical join is very easy. Just shuffle by the join key. There was not really enough low-level detail here saying why joins are better in tez. Join is not a natural operation in MapReduce. MR gives you one input and one output. You end up having to bend the rules to have multiple inputs. The idea here is that Tez can provide operators that naturally work with joins and other operations that don't fit the one input/one output model (e.g. unions, etc.). 
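[Editorial note: the "bend the rules" approach Alan describes is typically a reduce-side join: tag each record with which input it came from, shuffle both inputs on the join key, and pair them up in the reducer. The sketch below simulates that with plain Java collections, with no Hadoop dependency; the class and method names are illustrative, not from Hive or Tez.]

```java
import java.util.*;

public class ReduceSideJoinSketch {

    // "Shuffle" phase: tag each (key, value) record with its source and group by join key.
    static Map<String, List<String[]>> shuffle(List<String[]> left, List<String[]> right) {
        Map<String, List<String[]>> byKey = new TreeMap<>();
        for (String[] kv : left) {
            byKey.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(new String[]{"L", kv[1]});
        }
        for (String[] kv : right) {
            byKey.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(new String[]{"R", kv[1]});
        }
        return byKey;
    }

    // "Reduce" phase: for each key, cross the left-tagged values with the right-tagged ones.
    static List<String> join(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String[]>> e : shuffle(left, right).entrySet()) {
            List<String> ls = new ArrayList<>();
            List<String> rs = new ArrayList<>();
            for (String[] tagged : e.getValue()) {
                if ("L".equals(tagged[0])) { ls.add(tagged[1]); } else { rs.add(tagged[1]); }
            }
            for (String l : ls) {
                for (String r : rs) {
                    out.add(e.getKey() + ":" + l + "," + r);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> emp = Arrays.asList(new String[]{"1", "alice"}, new String[]{"2", "bob"});
        List<String[]> dept = Arrays.asList(new String[]{"1", "eng"}, new String[]{"3", "sales"});
        // Inner join: only key "1" has rows on both sides.
        System.out.println(join(emp, dept));
    }
}
```

The source-tagging is exactly the workaround for MR's one-input model; an engine whose tasks accept multiple labeled inputs makes the tagging unnecessary.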
"Choosing the number of maps and reduces is hard" Really? I do not find it that hard, I think there are times when it's not perfect but I do not find it hard. The talk did not really offer anything technical here on how tez makes this better other than that it could make it better. Perhaps manual would be a better term here than hard. In our experience it takes quite a bit of engineering trial and error to determine the optimal numbers. This may be ok if you're going to invest the time once and then run the same query every day for 6 months. But obviously it doesn't work for the ad hoc case. Even in the batch case it's not optimal because every once in a while an engineer has to go back and re-optimize the query to deal with changing data sizes, data characteristics, etc. We want the optimizer to handle this without human intervention. The presentations mentioned streaming data; how do two nodes stream data between tasks and how is it reliable? If the sender or receiver dies does the entire process have to start again? If the sender or receiver dies then the query has to be restarted from some previous point where data was persisted to disk. The idea here is that speed vs error recovery trade offs should be made by the optimizer. If the optimizer estimates that a query will complete in 5 seconds it can stream everything and if a node fails it just re-runs the whole query. If it estimates that a particular phase of a query will run for an hour it can choose to persist the results to HDFS so that in the event of a failure downstream the long phase need not be re-run. Again we want this to be done automatically by the system so the user doesn't need to control this level of detail. Again one of the talks implied there is a prototype out there that launches hive jobs into tez. I would like to see that; it might answer more questions than a PowerPoint, and I could profile some common queries. 
As mentioned in a previous email afaik Gunther's pushed all these changes to the Tez branch in Hive. Alan. Random late night thoughts over, Ed On Tue, Jul 30, 2013 at 12:02 AM, Edward Capriolo edlinuxg...@gmail.com wrote: At ~25:00 "There is a working prototype of hive which is using tez as the targeted runtime" Can I get a look at that code? Is it on github? Edward On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates ga...@hortonworks.com wrote: Answers to some of your questions inlined. Alan. On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote: There are some points I want to bring up. First, I am on the PMC. Here is something I find relevant: http://www.apache.org/foundation/how-it-works.html -- The role of the PMC from a Foundation perspective is oversight. The main role of the PMC is not code and not coding - but to ensure that all legal issues are addressed, that procedure is followed, and that each and every
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-4388: --- Component/s: HBase Handler HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12827: HIVE-4611 - SMB joins fail based on bigtable selection policy.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12827/ --- (Updated Aug. 5, 2013, 5:57 p.m.) Review request for hive, Ashutosh Chauhan, Brock Noland, and Gunther Hagleitner. Changes --- Addressed Gunther's comments. Bugs: HIVE-4611 https://issues.apache.org/jira/browse/HIVE-4611 Repository: hive-git Description --- SMB joins fail based on bigtable selection policy. The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 12e9334 ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java cc9de54 ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java 5320143 ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java db5ff0f ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java db3c9e7 ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java cd1b4ad ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java b882f87 ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java 3071713 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java e214807 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java da5115b ql/src/test/queries/clientnegative/auto_sortmerge_join_1.q c858254 ql/src/test/queries/clientpositive/auto_sortmerge_join_15.q PRE-CREATION ql/src/test/results/clientnegative/auto_sortmerge_join_1.q.out 0eddb69 ql/src/test/results/clientnegative/smb_bucketmapjoin.q.out 7a5b8c1 
ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out PRE-CREATION Diff: https://reviews.apache.org/r/12827/diff/ Testing --- All tests pass on hadoop 1. Thanks, Vikram Dixit Kumaraswamy
[jira] [Updated] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4611: - Attachment: HIVE-4611.5.patch.txt Addressed Gunther's comments. SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.5.patch.txt, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Attachment: HIVE-4870.patch Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729726#comment-13729726 ] Vikram Dixit K commented on HIVE-4611: -- I deleted that test because it is no longer a negative test. I moved it to positive tests. SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.5.patch.txt, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Status: Patch Available (was: Open) Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12690: HIVE-4870: Explain Extended to show partition info for Fetch Task
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12690/ --- (Updated Aug. 5, 2013, 6:04 p.m.) Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch adds Partition Description info to Fetch Task. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 65c39d6 ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 0e8f96b ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 42e25fa ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 47a8635 ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c39d057 ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out bd7381f ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 6121722 ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e0cd848 ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 924fbad ql/src/test/results/clientpositive/bucketcontext_1.q.out 62910fb ql/src/test/results/clientpositive/bucketcontext_2.q.out 0857c9d ql/src/test/results/clientpositive/bucketcontext_3.q.out 69dc2b2 ql/src/test/results/clientpositive/bucketcontext_4.q.out 0d79901 ql/src/test/results/clientpositive/bucketcontext_7.q.out 19ea4fa ql/src/test/results/clientpositive/bucketcontext_8.q.out 9a7aaa0 ql/src/test/results/clientpositive/bucketmapjoin1.q.out 307132b ql/src/test/results/clientpositive/bucketmapjoin10.q.out 1a6bc06 ql/src/test/results/clientpositive/bucketmapjoin11.q.out bd9b1fe ql/src/test/results/clientpositive/bucketmapjoin12.q.out fc161a9 ql/src/test/results/clientpositive/bucketmapjoin13.q.out 30d8925 ql/src/test/results/clientpositive/bucketmapjoin2.q.out ebbb2ba ql/src/test/results/clientpositive/bucketmapjoin3.q.out 66918b6 ql/src/test/results/clientpositive/bucketmapjoin7.q.out 8105ba4 
ql/src/test/results/clientpositive/bucketmapjoin8.q.out 92c74a9 ql/src/test/results/clientpositive/bucketmapjoin9.q.out b7aec66 ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 2d803db ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4b8bd14 ql/src/test/results/clientpositive/join32.q.out 92d81b9 ql/src/test/results/clientpositive/join32_lessSize.q.out 82b3e4a ql/src/test/results/clientpositive/join33.q.out 92d81b9 ql/src/test/results/clientpositive/sort_merge_join_desc_6.q.out f6aae06 ql/src/test/results/clientpositive/sort_merge_join_desc_7.q.out dbce51a ql/src/test/results/clientpositive/stats11.q.out 9a5be33 ql/src/test/results/clientpositive/union22.q.out bec39f4 Diff: https://reviews.apache.org/r/12690/diff/ Testing --- All the hive unit tests passed. Thanks, John Pullokkaran
[jira] [Updated] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-4123: - Tags: orc, rle, encoding Fix Version/s: 0.12.0 Labels: orcfile (was: ) Affects Version/s: 0.12.0 Status: Patch Available (was: Open) Making patch available. The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Prasanth J Labels: orcfile Fix For: 0.12.0 Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, HIVE-4123.3.patch.txt, HIVE-4123.4.patch.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4989) Consolidate and simplify vectorization code and test generation
[ https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729745#comment-13729745 ] Tony Murphy commented on HIVE-4989: --- This patch should be good to go. HIVE-4971 covers the testVectorUDFUnixTimeStampLong failure. Consolidate and simplify vectorization code and test generation --- Key: HIVE-4989 URL: https://issues.apache.org/jira/browse/HIVE-4989 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Tony Murphy Fix For: vectorization-branch Attachments: HIVE-4989-vectorization.patch The current code generation is unwieldy to use and prone to errors. This change consolidates all the code and test generation into a single location, and removes the need to manually place files which can lead to missing or incomplete code or tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request 13274: Consolidate and simplify vectorization code and test generation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13274/ --- Review request for hive, Eric Hanson, Jitendra Pandey, Remus Rusanu, and Sarvesh Sakalanaga. Bugs: HIVE-4989 https://issues.apache.org/jira/browse/HIVE-4989 Repository: hive-git Description --- The current code generation is unwieldy to use and prone to errors. This change consolidates all the code and test generation into a single location, and removes the need to manually place files which can lead to missing or incomplete code or tests. New usage: From ql\src\gen\vectorization: javac org\apache\hadoop\hive\ql\exec\vector\gen*.java java org.apache.hadoop.hive.ql.exec.vector.gen.CodeGen Additionally, I've fixed some incomplete\broken test generations. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java a4c1999 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/ColumnArithmeticColumn.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/ColumnArithmeticScalar.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/ColumnCompareScalar.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/ColumnUnaryMinus.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterColumnCompareColumn.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterColumnCompareScalar.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterScalarCompareColumn.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterStringColumnCompareColumn.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterStringColumnCompareScalar.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterStringScalarCompareColumn.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/ScalarArithmeticColumn.txt 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestClass.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestCodeGen.java 34c093c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestColumnColumnFilterVectorExpressionEvaluation.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestColumnColumnOperationVectorExpressionEvaluation.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestColumnScalarFilterVectorExpressionEvaluation.txt 5b53d6a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/TestColumnScalarOperationVectorExpressionEvaluation.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFMinMax.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFMinMaxString.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFSum.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFVar.txt ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestColumnColumnOperationVectorExpressionEvaluation.java dd2cc09 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/TestColumnScalarFilterVectorExpressionEvaluation.java 6baa444 Diff: https://reviews.apache.org/r/13274/diff/ Testing --- Thanks, tony murphy
[jira] [Updated] (HIVE-4995) select * may incorrectly return empty fields with hbase-handler
[ https://issues.apache.org/jira/browse/HIVE-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-4995: --- Attachment: HIVE-4995.1.patch.txt select * may incorrectly return empty fields with hbase-handler --- Key: HIVE-4995 URL: https://issues.apache.org/jira/browse/HIVE-4995 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-4995.1.patch.txt, HIVE-4995.1.patch.txt HIVE-3725 added capability to pull hbase columns with prefixes. However the way the current logic to add columns stands in HiveHBaseTableInputFormat, it might cause some columns to incorrectly display empty fields. Consider the following query: {noformat} CREATE EXTERNAL TABLE test_table(key string, value1 map<string,string>, value2 string) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf-a:prefix.*,cf-a:another_col") TBLPROPERTIES ("hbase.table.name" = "test_table"); {noformat} Given the existing logic in HiveHBaseTableInputFormat: {code} for (int i = 0; i < columnsMapping.size(); i++) { ColumnMapping colMap = columnsMapping.get(i); if (colMap.hbaseRowKey) { continue; } if (colMap.qualifierName == null) { scan.addFamily(colMap.familyNameBytes); } else { scan.addColumn(colMap.familyNameBytes, colMap.qualifierNameBytes); } } {code} So for the above query, the 'addFamily' will be called first followed by 'addColumn' for the column family cf-a. This will wipe away whatever we had set with the 'addFamily' call in the previous step resulting in an empty column when queried. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
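[Editorial note: to make the order-dependence described in HIVE-4995 concrete, here is a toy model of the column-selection logic in plain Java. This is not the real HBase Scan API and the names are illustrative; it only models the idea that an empty qualifier set means "the whole family", that a later addColumn narrows a whole-family request, and that a sketched fix is to skip addColumn when the whole family was already requested.]

```java
import java.util.*;

public class ScanModelSketch {
    // family -> requested qualifiers; an EMPTY set stands for "the whole family".
    final Map<String, Set<String>> selection = new HashMap<>();

    void addFamily(String family) {
        selection.put(family, new HashSet<>());          // request every qualifier in the family
    }

    // Models the problematic behavior: a later addColumn narrows a
    // whole-family request down to just the named qualifier.
    void addColumn(String family, String qualifier) {
        selection.computeIfAbsent(family, f -> new HashSet<>()).add(qualifier);
    }

    // Sketched fix: leave an existing whole-family request untouched.
    void addColumnFixed(String family, String qualifier) {
        Set<String> quals = selection.get(family);
        if (quals != null && quals.isEmpty()) {
            return;                                      // whole family already requested
        }
        addColumn(family, qualifier);
    }

    public static void main(String[] args) {
        ScanModelSketch s = new ScanModelSketch();
        s.addFamily("cf-a");
        s.addColumn("cf-a", "another_col");              // narrows cf-a to a single qualifier
        System.out.println(s.selection);
    }
}
```

With the mapping `:key,cf-a:prefix.*,cf-a:another_col`, the buggy order of calls leaves only `another_col` selected for `cf-a`, which is why the prefix-mapped map column comes back empty.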
[jira] [Commented] (HIVE-4573) Support alternate table types for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729763#comment-13729763 ] Thejas M Nair commented on HIVE-4573: - bq. Although it seems that eventually (.13?) you would want the default to be CLASSIC If we don't set the default to CLASSIC sooner, it would never happen. As time goes by, more applications would start relying on this behavior. As [~the6campbells] points out, the CLASSIC behavior is documented to be the 'normal' behavior. While we should aim for backward compatibility, I am not sure if that applies to bugs as well. The managed vs external table information can certainly be very useful. It would be good to get that without changing the server configuration. Should we rely on something like 'describe table extended' for that? While I don't agree on the default, I don't think perfect should get in the way of good. This improves things by making the classic behavior possible. We can discuss the default in a separate jira. +1 for the patch. Support alternate table types for HiveServer2 - Key: HIVE-4573 URL: https://issues.apache.org/jira/browse/HIVE-4573 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Johndee Burks Assignee: Prasad Mujumdar Priority: Minor Attachments: HIVE-4573.1.patch, HIVE-4573.2.patch The getTables jdbc function no longer returns information when using normal JDBC table types like TABLE or VIEW. You must now use a more specific type such as MANAGED_TABLE or VIRTUAL_VIEW. An example application that will fail to return results against 0.10 is below, works without issue in 0.9. In my 0.10 test I used HS2. 
{code} import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; import org.apache.hive.jdbc.HiveDriver; import java.sql.DatabaseMetaData; public class TestGet { private static String driverName = "org.apache.hive.jdbc.HiveDriver"; /** * @param args * @throws SQLException */ public static void main(String[] args) throws SQLException { try { Class.forName(driverName); } catch (ClassNotFoundException e) { // TODO Auto-generated catch block e.printStackTrace(); System.exit(1); } Connection con = DriverManager.getConnection("jdbc:hive2://hostname:1/default"); DatabaseMetaData dbmd = con.getMetaData(); String[] types = {"TABLE"}; ResultSet rs = dbmd.getTables(null, null, "%", types); while (rs.next()) { System.out.println(rs.getString("TABLE_NAME")); } } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4573) Support alternate table types for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729771#comment-13729771 ] Edward Capriolo commented on HIVE-4573: --- External tables are exceedingly rare and commonly misunderstood. I do not understand why a driver would care if a table was external or managed, that is just an implementation detail. Support alternate table types for HiveServer2 - Key: HIVE-4573 URL: https://issues.apache.org/jira/browse/HIVE-4573 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Johndee Burks Assignee: Prasad Mujumdar Priority: Minor Attachments: HIVE-4573.1.patch, HIVE-4573.2.patch The getTables jdbc function no longer returns information when using normal JDBC table types like TABLE or VIEW. You must now use a more specific type such as MANAGED_TABLE or VIRTUAL_VIEW. An example application that will fail to return results against 0.10 is below, works without issue in 0.9. In my 0.10 test I used HS2. {code} import java.sql.SQLException; import java.sql.Connection; import java.sql.ResultSet; import java.sql.Statement; import java.sql.DriverManager; import org.apache.hive.jdbc.HiveDriver; import java.sql.DatabaseMetaData; public class TestGet { private static String driverName = "org.apache.hive.jdbc.HiveDriver"; /** * @param args * @throws SQLException */ public static void main(String[] args) throws SQLException { try { Class.forName(driverName); } catch (ClassNotFoundException e) { // TODO Auto-generated catch block e.printStackTrace(); System.exit(1); } Connection con = DriverManager.getConnection("jdbc:hive2://hostname:1/default"); DatabaseMetaData dbmd = con.getMetaData(); String[] types = {"TABLE"}; ResultSet rs = dbmd.getTables(null, null, "%", types); while (rs.next()) { System.out.println(rs.getString("TABLE_NAME")); } } } {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4995) select * may incorrectly return empty fields with hbase-handler
[ https://issues.apache.org/jira/browse/HIVE-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729768#comment-13729768 ] Brock Noland commented on HIVE-4995: +1 select * may incorrectly return empty fields with hbase-handler --- Key: HIVE-4995 URL: https://issues.apache.org/jira/browse/HIVE-4995 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-4995.1.patch.txt, HIVE-4995.1.patch.txt HIVE-3725 added capability to pull hbase columns with prefixes. However the way the current logic to add columns stands in HiveHBaseTableInputFormat, it might cause some columns to incorrectly display empty fields. Consider the following query: {noformat} CREATE EXTERNAL TABLE test_table(key string, value1 map<string,string>, value2 string) ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf-a:prefix.*,cf-a:another_col") TBLPROPERTIES ("hbase.table.name" = "test_table"); {noformat} Given the existing logic in HiveHBaseTableInputFormat: {code} for (int i = 0; i < columnsMapping.size(); i++) { ColumnMapping colMap = columnsMapping.get(i); if (colMap.hbaseRowKey) { continue; } if (colMap.qualifierName == null) { scan.addFamily(colMap.familyNameBytes); } else { scan.addColumn(colMap.familyNameBytes, colMap.qualifierNameBytes); } } {code} So for the above query, the 'addFamily' will be called first followed by 'addColumn' for the column family cf-a. This will wipe away whatever we had set with the 'addFamily' call in the previous step resulting in an empty column when queried. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729777#comment-13729777 ] Hive QA commented on HIVE-4388: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596151/HIVE-4388.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 2759 tests executed *Failed tests:* {noformat} org.apache.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteReadMR org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/307/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/307/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729784#comment-13729784 ] Arup Malakar commented on HIVE-4343: This patch causes compilation failure when compiled against hadoop 20. I tried _ant clean package -Dhadoop.mr.rev=20_ {code} [echo] Project: ql [javac] Compiling 904 source files to /Users/malakar/code/oss/hive/build/ql/classes [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/exec/SecureCmdDoAs.java:51: cannot find symbol [javac] symbol : variable HADOOP_TOKEN_FILE_LOCATION [javac] location: class org.apache.hadoop.security.UserGroupInformation [javac] env.put(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION, [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 1 error {code} HS2 with kerberos- local task for map join fails Key: HIVE-4343 URL: https://issues.apache.org/jira/browse/HIVE-4343 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4343.1.patch, HIVE-4343.2.patch, HIVE-4343.3.patch With hive server2 configured with kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4992) add ability to skip javadoc during build
[ https://issues.apache.org/jira/browse/HIVE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4992: -- Attachment: HIVE-4992.D11967.2.patch sershe updated the revision HIVE-4992 [jira] add ability to skip javadoc during build. Also change hcatalog/build.xml Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D11967 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D11967?vs=36885&id=36993#toc AFFECTED FILES build.xml hcatalog/build.xml To: JIRA, sershe add ability to skip javadoc during build Key: HIVE-4992 URL: https://issues.apache.org/jira/browse/HIVE-4992 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Attachments: HIVE-4992.D11967.1.patch, HIVE-4992.D11967.2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729800#comment-13729800 ] Thejas M Nair commented on HIVE-4343: - [~amalakar] HIVE-4991 has fix for the 0.20 build issue caused by this patch. HS2 with kerberos- local task for map join fails Key: HIVE-4343 URL: https://issues.apache.org/jira/browse/HIVE-4343 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4343.1.patch, HIVE-4343.2.patch, HIVE-4343.3.patch With hive server2 configured with kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729805#comment-13729805 ] Arup Malakar commented on HIVE-4343: [~thejas]Thanks for the update. Saw the patch in HIVE-4991. HS2 with kerberos- local task for map join fails Key: HIVE-4343 URL: https://issues.apache.org/jira/browse/HIVE-4343 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4343.1.patch, HIVE-4343.2.patch, HIVE-4343.3.patch With hive server2 configured with kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4991) hive build with 0.20 is broken
[ https://issues.apache.org/jira/browse/HIVE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729808#comment-13729808 ] Arup Malakar commented on HIVE-4991: I see the following error complaining about an import in HiveSessionImpl.java. HiveSessionImpl doesn't actually use that import, and removing it fixed the problem for me.
{code}
[echo] Project: service
[javac] Compiling 144 source files to /Users/malakar/code/oss/hive/build/service/classes
[javac] /Users/malakar/code/oss/hive/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java:29: package org.apache.commons.io does not exist
[javac] import org.apache.commons.io.FileUtils;
[javac] ^
[javac] Note: /Users/malakar/code/oss/hive/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error {code} hive build with 0.20 is broken -- Key: HIVE-4991 URL: https://issues.apache.org/jira/browse/HIVE-4991 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Edward Capriolo Priority: Blocker Labels: newbie Attachments: HIVE-4991.2.patch.txt, HIVE-4991.patch.txt As reported in HIVE-4911 ant clean package -Dhadoop.mr.rev=20 Fails with - {code} compile: [echo] Project: ql [javac] Compiling 898 source files to /Users/malakar/code/oss/hive/build/ql/classes [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35: package org.apache.commons.io does not exist [javac] import org.apache.commons.io.FileUtils; [javac] ^ [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743: cannot find symbol [javac] symbol : variable FileUtils [javac] location: class org.apache.hadoop.hive.ql.session.SessionState [javac] FileUtils.deleteDirectory(resourceDir); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12824/ --- (Updated Aug. 5, 2013, 6:54 p.m.) Review request for hive. Changes --- Rebased. Bugs: HIVE-4911 https://issues.apache.org/jira/browse/HIVE-4911 Repository: hive-git Description --- The QoP for HiveServer2 should be configurable to enable encryption. A new configuration, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the HiveServer2 service. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 555343ebffb9dcd5e58d5b99ce9ca52904f68ecf conf/hive-default.xml.template f01e715e4de95b4011210143f7d3add2d8a4d432 jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 00f43511b478c687b7811fc8ad66af2b507a3626 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java cde58c25991641573453217da71a7ac1acf6adfd metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java cef50f40ccb047a8135f704b2997968a2cf477b8 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 88151a1d48b12cf3a8346ae94b6d1a182a331992 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 1809e1b26ceee5de14a354a0e499aa8c0ab793bf service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb8377aed55e74f0ae18407996bb9e1216f service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1df6993cb9aac1bb195667b3123faee27d657c0a shims/src/common-secure/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 3e850ec3991cbb2d4343969ba8fe9df4a7d137b5 shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java ab7f5c0eb5345e68e3f223c9dfed8414de946661 Diff: https://reviews.apache.org/r/12824/diff/ Testing --- Thanks, Arup Malakar
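For readers unfamiliar with SASL quality-of-protection values, the sketch below shows how a hive.server2.thrift.rpc.protection setting might translate into SASL properties. The QopConfig class and its method are illustrative only, not the actual HiveAuthFactory/SaslQOP code from the patch; only the standard javax.security.sasl constants from the JDK are assumed.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Illustrative mapping from a configured protection level to the SASL
// properties that a Thrift SASL transport would be created with.
public class QopConfig {
    public static Map<String, String> saslProps(String protection) {
        String qop;
        switch (protection) {
            case "auth":      qop = "auth";      break; // authentication only
            case "auth-int":  qop = "auth-int";  break; // + integrity checking
            case "auth-conf": qop = "auth-conf"; break; // + confidentiality (encryption)
            default: throw new IllegalArgumentException("unknown QOP: " + protection);
        }
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, qop);           // "javax.security.sasl.qop"
        props.put(Sasl.SERVER_AUTH, "true"); // mutual authentication
        return props;
    }

    public static void main(String[] args) {
        System.out.println(saslProps("auth-conf").get(Sasl.QOP)); // auth-conf
    }
}
```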
[jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729816#comment-13729816 ] Hive QA commented on HIVE-4123: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594072/HIVE-4123.4.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/308/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/308/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-308/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'ant/src/org/apache/hadoop/hive/ant/antlib.xml' Reverted 'hbase-handler/ivy.xml' Reverted 'hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestSetup.java' Reverted 'hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java' Reverted 'hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java' Reverted 'hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableOutputFormat.java' Reverted 'hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java' Reverted 'hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java' Reverted 'build.xml' Reverted 'ivy/libraries.properties' Reverted 'hcatalog/core/build.xml' Reverted 'hcatalog/pom.xml' Reverted 'hcatalog/build.properties' Reverted 'hcatalog/build.xml' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/TestRevisionManager.java' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/snapshot/TestRevisionManagerEndpoint.java' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/ManyMiniCluster.java' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseDirectOutputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseBulkOutputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/test/org/apache/hcatalog/hbase/TestHBaseInputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/TableSnapshot.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/RevisionManagerProtocol.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/Transaction.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/RevisionManager.java' Reverted 
'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/RevisionManagerEndpointClient.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/RevisionManagerEndpoint.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/snapshot/ZKBasedRevisionManager.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/ImportSequenceFile.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HbaseSnapshotRecordReader.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HBaseHCatStorageHandler.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HBaseBaseOutputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HBaseDirectOutputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HBaseBulkOutputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/src/java/org/apache/hcatalog/hbase/HBaseInputFormat.java' Reverted 'hcatalog/storage-handlers/hbase/pom.xml' Reverted 'hcatalog/build-support/ant/build-common.xml' Reverted 'hcatalog/build-support/ant/deploy.xml' Reverted 'hcatalog/build-support/ant/checkstyle.xml' Reverted 'hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestE2EScenarios.java' Reverted 'build-common.xml' Reverted '.gitignore' Reverted 'ql/ivy.xml' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf build ant/src/org/apache/hadoop/hive/ant/SetSystemProperty.java hbase-handler/src/java/org/apache/hadoop/hive/hbase/PutWritable.java
[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arup Malakar updated HIVE-4911: --- Attachment: 20-build-temp-change-1.patch HIVE-4911-trunk-3.patch I used 20-build-temp-change-1.patch to compile against 20. [~thejas] Let me know if you have any comments. Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: 20-build-temp-change-1.patch, 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch The QoP for HiveServer2 should be configurable to enable encryption. A new configuration, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the HiveServer2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4998) support jdbc documented table types in default configuration
Thejas M Nair created HIVE-4998: --- Summary: support jdbc documented table types in default configuration Key: HIVE-4998 URL: https://issues.apache.org/jira/browse/HIVE-4998 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Thejas M Nair The jdbc table types supported by hive server2 are not the documented typical types [1] in jdbc; they are hive-specific types (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). HIVE-4573 added support for the jdbc documented typical types, but the HS2 default configuration is to return the hive types. The default configuration should result in the expected jdbc typical behavior. [1] http://docs.oracle.com/javase/6/docs/api/java/sql/DatabaseMetaData.html?is-external=true#getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4999) Shim class HiveHarFileSystem does not have a hadoop2 counterpart
Brock Noland created HIVE-4999: -- Summary: Shim class HiveHarFileSystem does not have a hadoop2 counterpart Key: HIVE-4999 URL: https://issues.apache.org/jira/browse/HIVE-4999 Project: Hive Issue Type: Sub-task Reporter: Brock Noland HiveHarFileSystem only exists in the 0.20 shim. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4988) HCatalog Pig Adapter test does not compile under hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-4988. Resolution: Duplicate HCatalog Pig Adapter test does not compile under hadoop2 Key: HIVE-4988 URL: https://issues.apache.org/jira/browse/HIVE-4988 Project: Hive Issue Type: Sub-task Reporter: Brock Noland {noformat} compile-test: [echo] hcatalog-pig-adapter [mkdir] Created dir: /home/brock/workspaces/hive-apache/hive/hcatalog/hcatalog-pig-adapter/build/test/classes [javac] Compiling 14 source files to /home/brock/workspaces/hive-apache/hive/hcatalog/hcatalog-pig-adapter/build/test/classes [javac] /home/brock/workspaces/hive-apache/hive/hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestE2EScenarios.java:196: org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be instantiated [javac] TaskAttemptContext rtaskContext = new TaskAttemptContext(conf , taskId ); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 1 error {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5000) hive.optimize.skewjoin can cause long running queries to fail
Brock Noland created HIVE-5000: -- Summary: hive.optimize.skewjoin can cause long running queries to fail Key: HIVE-5000 URL: https://issues.apache.org/jira/browse/HIVE-5000 Project: Hive Issue Type: Bug Reporter: Brock Noland Priority: Minor
{noformat}
MapReduce Total cumulative CPU time: 5 days 19 hours 7 minutes 8 seconds 540 msec
Ended Job = job_201301311513_15328
java.io.FileNotFoundException: File hdfs://:8020/tmp/hive-scripts/hive_2013-02-06_10-23-17_026_1520760778337129611/-mr-10002/hive_skew_join_bigkeys_0 does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:406)
	at org.apache.hadoop.hive.ql.plan.ConditionalResolverSkewJoin.getTasks(ConditionalResolverSkewJoin.java:96)
	at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1331)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1117)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:439)
	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:449)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:700)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Ended Job = -1079843427, job is filtered out (removed at runtime).
8390065 Rows loaded to hdfs:///tmp/hive-scripts/hive_2013-02-06_10-23-17_026_1520760778337129611/-ext-1
MapReduce Jobs Launched:
Job 0: Map: 970 Reduce: 260 Cumulative CPU: 500828.54 sec HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 5 days 19 hours 7 minutes 8 seconds 540 msec
OK
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
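One plausible guard for the FileNotFoundException above is to treat a missing big-keys directory as "no skewed rows" rather than listing it unconditionally. The sketch below illustrates the idea with java.nio on the local filesystem as a stand-in for the Hadoop FileSystem API; SkewDirListing and its method name are hypothetical, not the actual fix in ConditionalResolverSkewJoin.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: list the skew-join temp directory only if it exists,
// so a long-running query doesn't fail when no big keys were spilled.
public class SkewDirListing {
    public static List<Path> listBigKeyDirs(Path skewJoinTmpDir) throws IOException {
        List<Path> result = new ArrayList<>();
        if (!Files.isDirectory(skewJoinTmpDir)) {
            return result; // nothing was spilled; no follow-up task needed
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(skewJoinTmpDir)) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // A directory that was never created, as when no rows were skewed:
        // returns an empty list instead of throwing.
        System.out.println(listBigKeyDirs(Paths.get("hive_skew_join_bigkeys_0")).size());
    }
}
```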
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Attachment: (was: HIVE-4997.patch1) HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Status: Open (was: Patch Available) HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Attachment: HIVE-4997.1.patch HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch, HIVE-4997.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Status: Patch Available (was: Open) HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch, HIVE-4997.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4573) Support alternate table types for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729883#comment-13729883 ] Prasad Mujumdar commented on HIVE-4573: --- [~thejas] Thanks for setting up a separate ticket for the default behavior. It's a good idea to decouple that discussion from the base implementation. Support alternate table types for HiveServer2 - Key: HIVE-4573 URL: https://issues.apache.org/jira/browse/HIVE-4573 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Johndee Burks Assignee: Prasad Mujumdar Priority: Minor Attachments: HIVE-4573.1.patch, HIVE-4573.2.patch The getTables jdbc function no longer returns information when using normal JDBC table types like TABLE or VIEW. You must now use a more specific type such as MANAGED_TABLE or VIRTUAL_VIEW. An example application that will fail to return results against 0.10 is below; it works without issue in 0.9. In my 0.10 test I used HS2.
{code}
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
import org.apache.hive.jdbc.HiveDriver;
import java.sql.DatabaseMetaData;

public class TestGet {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      System.exit(1);
    }
    Connection con = DriverManager.getConnection("jdbc:hive2://hostname:1/default");
    DatabaseMetaData dbmd = con.getMetaData();
    String[] types = {"TABLE"};
    ResultSet rs = dbmd.getTables(null, null, "%", types);
    while (rs.next()) {
      System.out.println(rs.getString("TABLE_NAME"));
    }
  }
}
{code}
-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Attachment: (was: HIVE-4997.patch) HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Attachment: (was: HIVE-4997.1.patch) HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729885#comment-13729885 ] Hive QA commented on HIVE-4870: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596161/HIVE-4870.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2759 tests executed *Failed tests:* {noformat} org.apache.hcatalog.pig.TestOrcHCatLoaderComplexSchema.testTupleInBagInTupleInBag {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/309/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/309/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Intskirveli updated HIVE-4997: - Attachment: HIVE-4997.1.patch HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5001) [WebHCat] JobState is read/written with different user credentials
Eugene Koifman created HIVE-5001: Summary: [WebHCat] JobState is read/written with different user credentials Key: HIVE-5001 URL: https://issues.apache.org/jira/browse/HIVE-5001 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman JobState can be persisted to HDFS or Zookeeper. At various points in the lifecycle it's accessed with different user credentials, which may cause errors depending on how permissions are set. Example: When submitting an MR job, templeton.JarDelegator is used. It calls LauncherDelegator#queueAsUser(), which runs TempletonControllerJob with UserGroupInformation.doAs(). TempletonControllerJob will in turn create JobState and persist it. LauncherDelegator.registerJob() also modifies JobState but w/o doing a doAs(). So in the latter case it's possible that the JobState is persisted by a different user than the one that created/owns the file. templeton.tool.HDFSCleanup tries to delete these files w/o doAs. The 'childid' file, for example, is created with rw-r--r--, and its parent directory (job_201308051224_0001) has rwxr-xr-x. HDFSStorage doesn't set file permissions explicitly, so it must be using default permissions. So there is a potential issue here (depending on UMASK), especially once HIVE-4601 is addressed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5001) [WebHCat] JobState is read/written with different user credentials
[ https://issues.apache.org/jira/browse/HIVE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5001: - Description: JobState can be persisted to HDFS or Zookeeper. At various points in the lifecycle it's accessed with different user credentials, which may cause errors depending on how permissions are set. Example: When submitting an MR job, templeton.JarDelegator is used. It calls LauncherDelegator#queueAsUser(), which runs TempletonControllerJob with UserGroupInformation.doAs(). TempletonControllerJob will in turn create JobState and persist it. LauncherDelegator.registerJob() also modifies JobState but w/o doing a doAs(). So in the latter case it's possible that the JobState is persisted by a different user than the one that created/owns the file. templeton.tool.HDFSCleanup tries to delete these files w/o doAs. The 'childid' file, for example, is created with rw-r--r--, and its parent directory (job_201308051224_0001) has rwxr-xr-x. HDFSStorage doesn't set file permissions explicitly, so it must be using default permissions. So there is a potential issue here (depending on UMASK), especially once HIVE-4601 is addressed. Actually, even w/o HIVE-4601 the user that owns the WebHCat process is likely different than the one submitting a request. was: JobState can be persisted to HDFS or Zookeeper. At various points in the lifecycle it's accessed with different user credentials, which may cause errors depending on how permissions are set. Example: When submitting an MR job, templeton.JarDelegator is used. It calls LauncherDelegator#queueAsUser(), which runs TempletonControllerJob with UserGroupInformation.doAs(). TempletonControllerJob will in turn create JobState and persist it. LauncherDelegator.registerJob() also modifies JobState but w/o doing a doAs(). So in the latter case it's possible that the JobState is persisted by a different user than the one that created/owns the file.
templeton.tool.HDFSCleanup tries to delete these files w/o doAs. The 'childid' file, for example, is created with rw-r--r--, and its parent directory (job_201308051224_0001) has rwxr-xr-x. HDFSStorage doesn't set file permissions explicitly, so it must be using default permissions. So there is a potential issue here (depending on UMASK), especially once HIVE-4601 is addressed. [WebHCat] JobState is read/written with different user credentials -- Key: HIVE-5001 URL: https://issues.apache.org/jira/browse/HIVE-5001 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman JobState can be persisted to HDFS or Zookeeper. At various points in the lifecycle it's accessed with different user credentials, which may cause errors depending on how permissions are set. Example: When submitting an MR job, templeton.JarDelegator is used. It calls LauncherDelegator#queueAsUser(), which runs TempletonControllerJob with UserGroupInformation.doAs(). TempletonControllerJob will in turn create JobState and persist it. LauncherDelegator.registerJob() also modifies JobState but w/o doing a doAs(). So in the latter case it's possible that the JobState is persisted by a different user than the one that created/owns the file. templeton.tool.HDFSCleanup tries to delete these files w/o doAs. The 'childid' file, for example, is created with rw-r--r--, and its parent directory (job_201308051224_0001) has rwxr-xr-x. HDFSStorage doesn't set file permissions explicitly, so it must be using default permissions. So there is a potential issue here (depending on UMASK), especially once HIVE-4601 is addressed. Actually, even w/o HIVE-4601 the user that owns the WebHCat process is likely different than the one submitting a request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
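The default-permission concern in the description can be sketched with a little permission arithmetic (illustrative Java, not WebHCat code; the class and method names here are hypothetical). With the common 022 umask, default creation modes come out as rw-r--r-- (644) for files and rwxr-xr-x (755) for directories, and deleting a file requires write+execute on its parent directory, so a cleanup process running as a different, non-owner user cannot remove the file:

```java
// Sketch of why default permissions break cross-user cleanup: a file is
// removable by a user only if that user has write+execute on the parent
// directory; with a 022 umask the parent ends up 755, so only the owner
// (and superuser) can delete entries in it.
public class UmaskSketch {
  /** Effective creation mode: requested base mode with the umask bits cleared. */
  static int effectiveMode(int base, int umask) {
    return base & ~umask;
  }

  /** Can a non-owner, non-group ("other") user remove an entry from a directory with this mode? */
  static boolean otherCanDelete(int dirMode) {
    int other = dirMode & 07;                      // low three bits: "other" permissions
    return (other & 02) != 0 && (other & 01) != 0; // needs both write and execute
  }

  public static void main(String[] args) {
    int dir = effectiveMode(0777, 022);   // 0755, i.e. rwxr-xr-x
    int file = effectiveMode(0666, 022);  // 0644, i.e. rw-r--r--
    System.out.printf("dir=%o file=%o otherCanDelete=%b%n",
        dir, file, otherCanDelete(dir));
  }
}
```

This is why the report notes the problem depends on UMASK: a permissive umask (e.g. 000) would leave the parent directory at 777 and mask the bug.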
[jira] [Commented] (HIVE-4573) Support alternate table types for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729892#comment-13729892 ] Gunther Hagleitner commented on HIVE-4573: -- [~appodictic], [~thejas], [~prasadm] - If I understand this correctly this patch is needed as is and I am planning to commit in a few hours. The discussion on default, deprecation etc is happening on HIVE-4998. Speak up in the next couple of hours if you disagree. Support alternate table types for HiveServer2 - Key: HIVE-4573 URL: https://issues.apache.org/jira/browse/HIVE-4573 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Johndee Burks Assignee: Prasad Mujumdar Priority: Minor Attachments: HIVE-4573.1.patch, HIVE-4573.2.patch The getTables jdbc function no longer returns information when using normal JDBC table types like TABLE or VIEW. You must now use a more specific type such as MANAGED_TABLE or VIRTUAL_VIEW. An example application that will fail to return results against 0.10 is below, works without issue in 0.9. In my 0.10 test I used HS2. 
{code}
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
import org.apache.hive.jdbc.HiveDriver;
import java.sql.DatabaseMetaData;

public class TestGet {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      System.exit(1);
    }
    Connection con = DriverManager.getConnection("jdbc:hive2://hostname:1/default");
    DatabaseMetaData dbmd = con.getMetaData();
    String[] types = {"TABLE"};
    ResultSet rs = dbmd.getTables(null, null, "%", types);
    while (rs.next()) {
      System.out.println(rs.getString("TABLE_NAME"));
    }
  }
}
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
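The fix being discussed amounts to translating the classic JDBC table types a client passes to getTables ("TABLE", "VIEW") into the metastore-specific types HiveServer2 actually stores (MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW). A minimal sketch of that mapping, assuming illustrative names rather than the actual patch code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a "classic" JDBC table-type mapping: expand the
// generic types a JDBC client sends into the Hive metastore types they
// should match. Class and method names are illustrative only.
public class TableTypeMapping {
  private static final Map<String, String[]> CLASSIC_TO_HIVE = new HashMap<>();
  static {
    CLASSIC_TO_HIVE.put("TABLE", new String[] {"MANAGED_TABLE", "EXTERNAL_TABLE"});
    CLASSIC_TO_HIVE.put("VIEW", new String[] {"VIRTUAL_VIEW"});
  }

  /** Expand a client-supplied type into the Hive types it should match. */
  public static String[] toHiveTypes(String clientType) {
    String[] mapped = CLASSIC_TO_HIVE.get(clientType.toUpperCase());
    // Unknown types pass through unchanged (they may already be Hive types).
    return mapped != null ? mapped : new String[] {clientType};
  }

  public static void main(String[] args) {
    for (String t : toHiveTypes("TABLE")) {
      System.out.println(t);
    }
  }
}
```

With a mapping like this in place, the example program above would again return results for types = {"TABLE"} on 0.10, as it did on 0.9.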
[jira] [Commented] (HIVE-4051) Hive's metastore suffers from 1+N queries when querying partitions is slow
[ https://issues.apache.org/jira/browse/HIVE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729915#comment-13729915 ] Phabricator commented on HIVE-4051: --- ashutoshc has requested changes to the revision HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions is slow. Mostly looks good. Can you update the final patch with the new class in its own file, with the following two comments (if they look alright)? INLINE COMMENTS metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 You are still selecting dbname, tblname ? metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 Yes.. I think we should throw in those cases. Having an empty list will mask the root problem if there is any which results from it. REVISION DETAIL https://reviews.facebook.net/D11805 BRANCH HIVE-4051 ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe Cc: brock Hive's metastore suffers from 1+N queries when querying partitions is slow Key: HIVE-4051 URL: https://issues.apache.org/jira/browse/HIVE-4051 Project: Hive Issue Type: Bug Components: Clients, Metastore Environment: RHEL 6.3 / EC2 C1.XL Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: HIVE-4051.D11805.1.patch, HIVE-4051.D11805.2.patch, HIVE-4051.D11805.3.patch, HIVE-4051.D11805.4.patch, HIVE-4051.D11805.5.patch, HIVE-4051.D11805.6.patch, HIVE-4051.D11805.7.patch Hive's query client takes a long time to initialize and start planning queries because of delays in creating all the MTable/MPartition objects. For a Hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database. Several of those queries fetch exactly one row to create a single object on the client.
The following 12 queries were repeated for each partition, generating a storm of SQL queries
{code}
4 Query SELECT `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` = `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
4 Query SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` =4871
4 Query SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID` FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` = `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` =4871
4 Query SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` =4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID` =4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID` =4871 AND `STRING_LIST_ID_KID` IS NOT NULL
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` = `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` =4871
4 Query SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` =4871 AND NOT (`A0`.`STRING_LIST_ID_KID` IS NULL)
{code}
This data is not detached or cached, so this operation is performed during every query plan for the partitions, even in the same hive client. The queries are automatically generated by JDO/DataNucleus, which makes it nearly impossible to rewrite it into a single denormalized join operation and process it locally. Attempts to optimize this with JDO fetch-groups did not bear fruit in improving the query count. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
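The remedy behind HIVE-4051 is essentially batching: instead of issuing one SELECT per partition, fold all the partition ids into a single IN (...) query so the metastore issues a constant number of statements. A minimal sketch of the idea, assuming illustrative class and table shapes rather than the actual ObjectStore code:

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch of the 1+N -> 1 batching idea: build one query over
// all partition ids rather than one query per id. Not the actual Hive code.
public class BatchedPartitionQuery {
  /** The 1+N pattern: one statement per partition id. */
  static String singleRowQuery(long partId) {
    return "SELECT SD_ID FROM PARTITIONS WHERE PART_ID = " + partId;
  }

  /** The batched alternative: one statement for all ids. */
  static String batchedQuery(List<Long> partIds) {
    StringJoiner in = new StringJoiner(",", "(", ")");
    for (long id : partIds) {
      in.add(Long.toString(id));
    }
    return "SELECT PART_ID, SD_ID FROM PARTITIONS WHERE PART_ID IN " + in;
  }

  public static void main(String[] args) {
    // For 1800 partitions this is 1 statement instead of ~1800.
    System.out.println(batchedQuery(List.of(3945L, 3946L, 3947L)));
  }
}
```

Real code would of course use bound parameters rather than string concatenation; the point is only the statement count.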
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729921#comment-13729921 ] Yu Gao commented on HIVE-2935: -- Maybe I missed the discussion here, but seems to me that HiveServer2 can be configured with either SASL GSS (Kerberos) or SASL PLAIN (LDAP, CUSTOM username/password authentication), but not both simultaneously. Can I ask the reason for this, and whether it is straightforward to enable PLAIN and GSS simultaneously in the future? This is very useful for applications that have been supporting LDAP authentication on Hive, and when turn to Kerberos, legacy clients or non-kerberos clients would still be able to access kerberized HiveServer2. Thanks! Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: HiveServer2, Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Fix For: 0.11.0 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935-7.patch.tar.gz, HIVE-2935-7.testerrs.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4990) ORC seeks fails with non-zero offset or column projection
[ https://issues.apache.org/jira/browse/HIVE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4990: Status: Patch Available (was: Open) ORC seeks fails with non-zero offset or column projection - Key: HIVE-4990 URL: https://issues.apache.org/jira/browse/HIVE-4990 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.1 Attachments: HIVE-4990.D12009.1.patch The ORC reader gets exceptions when seeking with non-zero offsets or column projection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4794) Unit e2e tests for vectorization
[ https://issues.apache.org/jira/browse/HIVE-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4794: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks, Tony! Unit e2e tests for vectorization Key: HIVE-4794 URL: https://issues.apache.org/jira/browse/HIVE-4794 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Tony Murphy Fix For: vectorization-branch Attachments: HIVE-4794.1.patch, HIVE-4794.2.patch, HIVE-4794.3.patch, HIVE-4794.3-vectorization.patch, HIVE-4794.4-vectorization.patch, HIVE-4794.5-vectorization.patch, hive-4794.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4990) ORC seeks fails with non-zero offset or column projection
[ https://issues.apache.org/jira/browse/HIVE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729945#comment-13729945 ] Owen O'Malley commented on HIVE-4990: - The fix protects against null in seek and skiprows and subtracts off the missing firstRow for readers that are only reading a part of the file. ORC seeks fails with non-zero offset or column projection - Key: HIVE-4990 URL: https://issues.apache.org/jira/browse/HIVE-4990 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.1 Attachments: HIVE-4990.D12009.1.patch The ORC reader gets exceptions when seeking with non-zero offsets or column projection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4971) Unit test failure in TestVectorTimestampExpressions
[ https://issues.apache.org/jira/browse/HIVE-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4971: --- Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) Committed to branch. Thanks, Gopal! Unit test failure in TestVectorTimestampExpressions --- Key: HIVE-4971 URL: https://issues.apache.org/jira/browse/HIVE-4971 Project: Hive Issue Type: Sub-task Components: Tests, UDF Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Gopal V Fix For: vectorization-branch Attachments: HIVE-4971.patch, HIVE-4971-vectorization.patch Unit test testVectorUDFUnixTimeStampLong is failing TestVectorTimestampExpressions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4990) ORC seeks fails with non-zero offset or column projection
[ https://issues.apache.org/jira/browse/HIVE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4990: -- Attachment: HIVE-4990.D12009.1.patch omalley requested code review of HIVE-4990 [jira] ORC seeks fails with non-zero offset or column projection. Reviewers: JIRA HIVE-4990 The ORC reader gets exceptions when seeking with non-zero offsets or column projection. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12009 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/28683/ To: JIRA, omalley ORC seeks fails with non-zero offset or column projection - Key: HIVE-4990 URL: https://issues.apache.org/jira/browse/HIVE-4990 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.1 Attachments: HIVE-4990.D12009.1.patch The ORC reader gets exceptions when seeking with non-zero offsets or column projection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4995) select * may incorrectly return empty fields with hbase-handler
[ https://issues.apache.org/jira/browse/HIVE-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729950#comment-13729950 ] Hive QA commented on HIVE-4995: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596171/HIVE-4995.1.patch.txt {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2760 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/310/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/310/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. select * may incorrectly return empty fields with hbase-handler --- Key: HIVE-4995 URL: https://issues.apache.org/jira/browse/HIVE-4995 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-4995.1.patch.txt, HIVE-4995.1.patch.txt HIVE-3725 added capability to pull hbase columns with prefixes. However the way the current logic to add columns stands in HiveHBaseTableInput format, it might cause some columns to incorrectly display empty fields. 
Consider the following query:
{noformat}
CREATE EXTERNAL TABLE test_table(key string, value1 map<string,string>, value2 string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf-a:prefix.*,cf-a:another_col")
TBLPROPERTIES ("hbase.table.name" = "test_table");
{noformat}
Given the existing logic in HiveHBaseTableInputFormat:
{code}
for (int i = 0; i < columnsMapping.size(); i++) {
  ColumnMapping colMap = columnsMapping.get(i);
  if (colMap.hbaseRowKey) {
    continue;
  }
  if (colMap.qualifierName == null) {
    scan.addFamily(colMap.familyNameBytes);
  } else {
    scan.addColumn(colMap.familyNameBytes, colMap.qualifierNameBytes);
  }
}
{code}
So for the above query, 'addFamily' will be called first, followed by 'addColumn' for the same column family cf-a. This wipes away whatever was set by the 'addFamily' call in the previous step, resulting in an empty column when queried. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
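The wipe-out can be modeled without HBase at all. The sketch below (illustrative Java, not the HBase Scan API or the eventual patch) mimics the relevant Scan behavior, where adding a specific qualifier replaces an earlier whole-family selection, and shows one possible guard: skip addColumn for families already selected whole.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the ordering bug: like HBase's Scan, a qualifier-level add
// replaces an earlier whole-family selection for the same family.
public class ScanColumnSketch {
  // A null qualifier set means "the whole family is selected".
  final Map<String, Set<String>> familyMap = new HashMap<>();

  void addFamily(String family) {
    familyMap.put(family, null);
  }

  void addColumn(String family, String qualifier) {
    // computeIfAbsent treats a null value as absent, so this replaces a
    // whole-family selection with a single-qualifier set -- the bug.
    familyMap.computeIfAbsent(family, f -> new HashSet<>()).add(qualifier);
  }

  // Guarded variant: never narrow a family that was already added whole.
  void addColumnGuarded(String family, String qualifier) {
    if (familyMap.containsKey(family) && familyMap.get(family) == null) {
      return; // keep the whole-family selection (covers cf-a:prefix.*)
    }
    addColumn(family, qualifier);
  }

  boolean selectsWholeFamily(String family) {
    return familyMap.containsKey(family) && familyMap.get(family) == null;
  }

  public static void main(String[] args) {
    ScanColumnSketch buggy = new ScanColumnSketch();
    buggy.addFamily("cf-a");                 // for the cf-a:prefix.* mapping
    buggy.addColumn("cf-a", "another_col");  // narrows cf-a, losing prefix.*
    ScanColumnSketch fixed = new ScanColumnSketch();
    fixed.addFamily("cf-a");
    fixed.addColumnGuarded("cf-a", "another_col");
    System.out.println(buggy.selectsWholeFamily("cf-a") + " vs " + fixed.selectsWholeFamily("cf-a"));
  }
}
```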
[jira] [Commented] (HIVE-4967) Don't serialize unnecessary fields in query plan
[ https://issues.apache.org/jira/browse/HIVE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729998#comment-13729998 ] Ashutosh Chauhan commented on HIVE-4967: Ping [~brocknoland] : ) Don't serialize unnecessary fields in query plan Key: HIVE-4967 URL: https://issues.apache.org/jira/browse/HIVE-4967 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-4967.patch There are quite a few fields which need not to be serialized since they are initialized anyways in backend. We need not to serialize them in our plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4967) Don't serialize unnecessary fields in query plan
[ https://issues.apache.org/jira/browse/HIVE-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730004#comment-13730004 ] Brock Noland commented on HIVE-4967: Hey thanks for the ping! I will review right now. Don't serialize unnecessary fields in query plan Key: HIVE-4967 URL: https://issues.apache.org/jira/browse/HIVE-4967 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-4967.patch There are quite a few fields which need not to be serialized since they are initialized anyways in backend. We need not to serialize them in our plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4964: -- Attachment: HIVE-4964.D11985.2.patch hbutani updated the revision HIVE-4964 [jira] Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced. - fix formatting issues - fix lint issues Reviewers: JIRA, ashutoshc REVISION DETAIL https://reviews.facebook.net/D11985 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D11985?vs=36957id=37053#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDesc.java ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDeserializer.java ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java To: JIRA, ashutoshc, hbutani Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730044#comment-13730044 ] Harish Butani commented on HIVE-4964: - Yes the formatting issues are long overdue. These are carryover from when we initially wrote a lot of this code; was using a different set of formatting rules. Cannot get eclipse to auto fix based on Hive's rules; so manually fixing them(as much as possible). So please bear with us... Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730038#comment-13730038 ] Hive QA commented on HIVE-4388: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596151/HIVE-4388.patch {color:green}SUCCESS:{color} +1 2759 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/311/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/311/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Inconsistent results with and without index. Is this a bug?
Hive Dev Team, Greetings! We have encountered an issue when using Hive 0.8.1.8 and Hive 0.11.0. After some investigation, we think this looks like a bug in Hive. I'm therefore sending this email to report the issue and to confirm with you. Please let me know if this is not the correct mailing list for this kind of topic. The issue we had is related to indexed queries on external tables stored as sequence files. For example, suppose we have a simple table like the one created below:

CREATE TABLE hive_test ( id int, name string, info string ) STORED AS SEQUENCEFILE;

We first insert 5000 rows with the same id (e.g., id = 1) into this table. We then count the total number of rows in the table by running the query below and get the correct result, 5000.

select count(*) from hive_test where id = 1;

After this, we create an index on id:

CREATE INDEX test_index ON TABLE hive_test(id) AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD;
ALTER INDEX test_index ON hive_test REBUILD;
set hive.optimize.index.filter=true;
set hive.optimize.index.filter.compact.minsize=0;

Then, we run the same query 'select count(*) from hive_test where id = 1;' again but get a different result (count > 5000).
We tried to dig into the Hive source code and found the following piece of code in HiveIndexedInputFormat.java, which might be the root cause of the duplicated rows:

if (split.inputFormatClassName().contains("RCFile") ||
    split.inputFormatClassName().contains("SequenceFile")) {
  if (split.getStart() > SequenceFile.SYNC_INTERVAL) {
    newSplit = new HiveInputSplit(new FileSplit(split.getPath(),
        split.getStart() - SequenceFile.SYNC_INTERVAL,
        split.getLength() + SequenceFile.SYNC_INTERVAL,
        split.getLocations()), split.inputFormatClassName());
  }
}

According to my understanding of SequenceFile and SequenceFileRecordReader, I think it's unnecessary and incorrect to add the extra 2000 bytes to the beginning of each input split, because it causes some of the rows in the overlapping regions to be processed by two mappers. Please correct me if I'm wrong. Thank you, Xing
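The overlap Xing describes can be made concrete with a small calculation (an illustrative sketch of the argument above, not Hive code): rewinding every non-zero split start by SYNC_INTERVAL makes each split overlap its predecessor by exactly SYNC_INTERVAL bytes, so records whose sync point falls inside that window can be consumed by two mappers.

```java
// Sketch of the split-rewind overlap: apply the rewind from
// HiveIndexedInputFormat to two adjacent splits and measure how many
// bytes they now cover twice.
public class SplitOverlapSketch {
  static final long SYNC_INTERVAL = 2000; // SequenceFile.SYNC_INTERVAL

  /** [start, end) of a split after the rewind quoted above. */
  static long[] rewoundSplit(long start, long length) {
    if (start > SYNC_INTERVAL) {
      return new long[] {start - SYNC_INTERVAL, start + length};
    }
    return new long[] {start, start + length};
  }

  /** Bytes covered by both half-open ranges. */
  static long overlap(long[] a, long[] b) {
    return Math.max(0, Math.min(a[1], b[1]) - Math.max(a[0], b[0]));
  }

  public static void main(String[] args) {
    long[] s0 = rewoundSplit(0, 64_000_000);          // first split, unrewound
    long[] s1 = rewoundSplit(64_000_000, 64_000_000); // second split, rewound
    System.out.println("overlap bytes: " + overlap(s0, s1));
  }
}
```

Whether the overlap actually yields duplicate rows depends on how the record reader handles sync markers at split boundaries, which is exactly the question the mail raises.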
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730058#comment-13730058 ] Edward Capriolo commented on HIVE-4964: --- With eclipse I have had luck using this as my code-style settings. typically you can highlight some code or the entire file, right click and then ask eclipse to reformat. https://github.com/zznate/intravert-ug/blob/master/src/main/resources/eclipseUima_code_style_prefs.xml Cleanup PTF code: remove code dealing with non standard sql behavior we had original introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Priority: Minor Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Attachment: HIVE-4870.patch Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4870) Explain Extended to show partition info for Fetch Task
[ https://issues.apache.org/jira/browse/HIVE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-4870: - Attachment: (was: HIVE-4870.patch) Explain Extended to show partition info for Fetch Task -- Key: HIVE-4870 URL: https://issues.apache.org/jira/browse/HIVE-4870 Project: Hive Issue Type: Bug Components: Query Processor, Tests Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.11.1 Attachments: HIVE-4870.patch Explain extended does not include partition information for Fetch Task (FetchWork). Map Reduce Task (MapredWork)already does this. Patch includes Partition Description info to Fetch Task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4683) fix coverage org.apache.hadoop.hive.cli
[ https://issues.apache.org/jira/browse/HIVE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730101#comment-13730101 ] Hudson commented on HIVE-4683: -- SUCCESS: Integrated in Hive-trunk-h0.21 #2246 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2246/]) HIVE-4683 : fix coverage org.apache.hadoop.hive.cli (Aleksey Gorshkov via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1510346) * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/RCFileCat.java * /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestCliDriverMethods.java * /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestCliSessionState.java * /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestOptionsProcessor.java * /hive/trunk/cli/src/test/org/apache/hadoop/hive/cli/TestRCFileCat.java fix coverage org.apache.hadoop.hive.cli --- Key: HIVE-4683 URL: https://issues.apache.org/jira/browse/HIVE-4683 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0 Reporter: Aleksey Gorshkov Assignee: Aleksey Gorshkov Fix For: 0.12.0 Attachments: HIVE-4683-branch-0.10.patch, HIVE-4683-branch-0.10-v1.patch, HIVE-4683-branch-0.11-v1.patch, HIVE-4683-trunk.patch, HIVE-4683-trunk-v1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables
[ https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730111#comment-13730111 ] Hive QA commented on HIVE-4997: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596206/HIVE-4997.1.patch {color:green}SUCCESS:{color} +1 2760 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/312/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/312/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. HCatalog doesn't allow multiple input tables Key: HIVE-4997 URL: https://issues.apache.org/jira/browse/HIVE-4997 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Intskirveli Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4997.1.patch HCatInputFormat does not allow reading from multiple hive tables in the same MapReduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4826) Setup build infrastructure for tez
[ https://issues.apache.org/jira/browse/HIVE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4826: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to tez branch. Setup build infrastructure for tez -- Key: HIVE-4826 URL: https://issues.apache.org/jira/browse/HIVE-4826 Project: Hive Issue Type: New Feature Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-4826.2.patch, HIVE-4826.patch Address changes required in ivy and build xml files to support tez. NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-5001) [WebHCat] JobState is read/written with different user credentials
[ https://issues.apache.org/jira/browse/HIVE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-5001: Assignee: Eugene Koifman [WebHCat] JobState is read/written with different user credentials -- Key: HIVE-5001 URL: https://issues.apache.org/jira/browse/HIVE-5001 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman JobState can be persisted to HDFS or Zookeeper. At various points in the lifecycle it's accessed with different user credentials and thus may cause errors depending on how permissions are set. Example: When submitting a MR job, templeton.JarDelegator is used. It calls LauncherDelegator#queueAsUser(), which runs TempletonControllerJob with UserGroupInformation.doAs(). TempletonControllerJob will in turn create JobState and persist it. LauncherDelegator.registerJob() also modifies JobState, but without doing a doAs(). So in the latter case it's possible that the persisted JobState is written by a different user than the one that created/owns the file. templeton.tool.HDFSCleanup tries to delete these files w/o doAs. The 'childid' file, for example, is created with rw-r--r--, and its parent directory (job_201308051224_0001) has rwxr-xr-x. HDFSStorage doesn't set file permissions explicitly, so it must be using default permissions. So there is a potential issue here (depending on UMASK), especially once HIVE-4601 is addressed. Actually, even w/o HIVE-4601 the user that owns the WebHCat process is likely different than the one submitting a request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5002) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private
Owen O'Malley created HIVE-5002: --- Summary: Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private Key: HIVE-5002 URL: https://issues.apache.org/jira/browse/HIVE-5002 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Some users want to be able to access the rowIndexes directly from ORC reader extensions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5002) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private
[ https://issues.apache.org/jira/browse/HIVE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-5002: Component/s: File Formats Affects Version/s: 0.12.0 Fix Version/s: 0.12.0 Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private --- Key: HIVE-5002 URL: https://issues.apache.org/jira/browse/HIVE-5002 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Some users want to be able to access the rowIndexes directly from ORC reader extensions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5002) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private
[ https://issues.apache.org/jira/browse/HIVE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-5002: -- Attachment: HIVE-5002.D12015.1.patch omalley requested code review of HIVE-5002 [jira] Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private. Reviewers: JIRA HIVE-5002 Some users want to be able to access the rowIndexes directly from ORC reader extensions. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12015 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/28713/ To: JIRA, omalley Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private --- Key: HIVE-5002 URL: https://issues.apache.org/jira/browse/HIVE-5002 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5002.D12015.1.patch Some users want to be able to access the rowIndexes directly from ORC reader extensions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4051) Hive's metastore suffers from 1+N queries when querying partitions is slow
[ https://issues.apache.org/jira/browse/HIVE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4051: -- Attachment: HIVE-4051.D11805.8.patch sershe updated the revision HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions is slow. Moved the code for SQL filter generation and usage into a separate class. The only other changes are the latest two comments on Phabricator, as well as some minor cleanup like null checks. Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D11805 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D11805?vs=36879&id=37071#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java To: JIRA, ashutoshc, sershe Cc: brock Hive's metastore suffers from 1+N queries when querying partitions is slow Key: HIVE-4051 URL: https://issues.apache.org/jira/browse/HIVE-4051 Project: Hive Issue Type: Bug Components: Clients, Metastore Environment: RHEL 6.3 / EC2 C1.XL Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: HIVE-4051.D11805.1.patch, HIVE-4051.D11805.2.patch, HIVE-4051.D11805.3.patch, HIVE-4051.D11805.4.patch, HIVE-4051.D11805.5.patch, HIVE-4051.D11805.6.patch, HIVE-4051.D11805.7.patch, HIVE-4051.D11805.8.patch Hive's query client takes a long time to initialize and start planning queries because of delays in creating all the MTable/MPartition objects. For a hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database. Several of those queries fetch exactly one row to create a single object on the client.
The following 12 queries were repeated for each partition, generating a storm of SQL queries:
{code}
4 Query SELECT `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` = `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
4 Query SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` = 4871
4 Query SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID` FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` = `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` = 4871
4 Query SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` = 4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID` = 4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID` = 4871 AND `STRING_LIST_ID_KID` IS NOT NULL
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` = `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` = 4871
4 Query SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` = 4871 AND NOT (`A0`.`STRING_LIST_ID_KID` IS NULL)
{code}
This data is not detached or cached, so this operation is performed during every query plan for the partitions, even in the same hive client. The queries are automatically generated by JDO/DataNucleus, which makes it nearly impossible to rewrite them into a single denormalized join operation and process it locally. Attempts to optimize this with JDO fetch-groups did not bear fruit in improving the query count.
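One way out of the 1+N pattern above — the direction the attached MetaStoreDirectSql.java work moves in — is to batch the per-ID lookups into a single statement. A minimal sketch (hypothetical column list and query-building helper, not the actual Hive code):

{code}
import java.util.List;
import java.util.stream.Collectors;

public class BatchedLookup {
    // Instead of one round trip per SD_ID (the 1+N pattern in the log above),
    // collect the IDs and issue a single IN (...) query.
    static String batchedQuery(List<Long> sdIds) {
        String ids = sdIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return "SELECT `SD_ID`,`LOCATION` FROM `SDS` WHERE `SD_ID` IN (" + ids + ")";
    }

    public static void main(String[] args) {
        // One query for three storage descriptors instead of three queries.
        System.out.println(batchedQuery(List.of(4871L, 4872L, 4873L)));
    }
}
{code}

The rows then get assembled into objects in application code; the trade-off is hand-written SQL that bypasses JDO, which is exactly why the patch isolates it in a separate class.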
[jira] [Resolved] (HIVE-4916) Add TezWork
[ https://issues.apache.org/jira/browse/HIVE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-4916. -- Resolution: Fixed Committed to tez branch. Add TezWork --- Key: HIVE-4916 URL: https://issues.apache.org/jira/browse/HIVE-4916 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-4916.1.patch.branch, HIVE-4916.2.patch.txt TezWork is the class that encapsulates all the info needed to execute a single Tez job (i.e.: a dag of map or reduce work). NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4051) Hive's metastore suffers from 1+N queries when querying partitions is slow
[ https://issues.apache.org/jira/browse/HIVE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730137#comment-13730137 ] Phabricator commented on HIVE-4051: --- sershe has commented on the revision HIVE-4051 [jira] Hive's metastore suffers from 1+N queries when querying partitions is slow. INLINE COMMENTS metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1681 fixed metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1776 fixed REVISION DETAIL https://reviews.facebook.net/D11805 To: JIRA, ashutoshc, sershe Cc: brock Hive's metastore suffers from 1+N queries when querying partitions is slow Key: HIVE-4051 URL: https://issues.apache.org/jira/browse/HIVE-4051 Project: Hive Issue Type: Bug Components: Clients, Metastore Environment: RHEL 6.3 / EC2 C1.XL Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: HIVE-4051.D11805.1.patch, HIVE-4051.D11805.2.patch, HIVE-4051.D11805.3.patch, HIVE-4051.D11805.4.patch, HIVE-4051.D11805.5.patch, HIVE-4051.D11805.6.patch, HIVE-4051.D11805.7.patch, HIVE-4051.D11805.8.patch Hive's query client takes a long time to initialize and start planning queries because of delays in creating all the MTable/MPartition objects. For a hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database. Several of those queries fetch exactly one row to create a single object on the client.
[jira] [Updated] (HIVE-5002) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private
[ https://issues.apache.org/jira/browse/HIVE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-5002: Status: Patch Available (was: Open) Loosen readRowIndex visibility in ORC's RecordReaderImpl to package private --- Key: HIVE-5002 URL: https://issues.apache.org/jira/browse/HIVE-5002 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.12.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5002.D12015.1.patch Some users want to be able to access the rowIndexes directly from ORC reader extensions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4991) hive build with 0.20 is broken
[ https://issues.apache.org/jira/browse/HIVE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4991: Attachment: HIVE-4991.3.patch.txt HIVE-4991.3.patch.txt - change to use version number from ivy/libraries.properties hive build with 0.20 is broken -- Key: HIVE-4991 URL: https://issues.apache.org/jira/browse/HIVE-4991 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Edward Capriolo Priority: Blocker Labels: newbie Attachments: HIVE-4991.2.patch.txt, HIVE-4991.3.patch.txt, HIVE-4991.patch.txt As reported in HIVE-4911 ant clean package -Dhadoop.mr.rev=20 Fails with - {code} compile: [echo] Project: ql [javac] Compiling 898 source files to /Users/malakar/code/oss/hive/build/ql/classes [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35: package org.apache.commons.io does not exist [javac] import org.apache.commons.io.FileUtils; [javac] ^ [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743: cannot find symbol [javac] symbol : variable FileUtils [javac] location: class org.apache.hadoop.hive.ql.session.SessionState [javac] FileUtils.deleteDirectory(resourceDir); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4991) hive build with 0.20 is broken
[ https://issues.apache.org/jira/browse/HIVE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730182#comment-13730182 ] Gunther Hagleitner commented on HIVE-4991: -- [~appodictic] If you're fine with patch .3 (only difference is to reuse existing property) I'll commit that. hive build with 0.20 is broken -- Key: HIVE-4991 URL: https://issues.apache.org/jira/browse/HIVE-4991 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Edward Capriolo Priority: Blocker Labels: newbie Attachments: HIVE-4991.2.patch.txt, HIVE-4991.3.patch.txt, HIVE-4991.patch.txt As reported in HIVE-4911 ant clean package -Dhadoop.mr.rev=20 Fails with - {code} compile: [echo] Project: ql [javac] Compiling 898 source files to /Users/malakar/code/oss/hive/build/ql/classes [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35: package org.apache.commons.io does not exist [javac] import org.apache.commons.io.FileUtils; [javac] ^ [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743: cannot find symbol [javac] symbol : variable FileUtils [javac] location: class org.apache.hadoop.hive.ql.session.SessionState [javac] FileUtils.deleteDirectory(resourceDir); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4991) hive build with 0.20 is broken
[ https://issues.apache.org/jira/browse/HIVE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730185#comment-13730185 ] Gunther Hagleitner commented on HIVE-4991: -- [~amalakar] I cannot reproduce your findings. With Ed's patch in place everything compiles for me for 20, 20S and 23. All of them have commons-io on the classpath now. hive build with 0.20 is broken -- Key: HIVE-4991 URL: https://issues.apache.org/jira/browse/HIVE-4991 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Edward Capriolo Priority: Blocker Labels: newbie Attachments: HIVE-4991.2.patch.txt, HIVE-4991.3.patch.txt, HIVE-4991.patch.txt As reported in HIVE-4911 ant clean package -Dhadoop.mr.rev=20 Fails with - {code} compile: [echo] Project: ql [javac] Compiling 898 source files to /Users/malakar/code/oss/hive/build/ql/classes [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35: package org.apache.commons.io does not exist [javac] import org.apache.commons.io.FileUtils; [javac] ^ [javac] /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743: cannot find symbol [javac] symbol : variable FileUtils [javac] location: class org.apache.hadoop.hive.ql.session.SessionState [javac] FileUtils.deleteDirectory(resourceDir); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4573) Support alternate table types for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4573: - Resolution: Fixed Fix Version/s: 0.12.0 Release Note: Adds new config parameter that needs to be documented:
{code}
<property>
  <name>hive.server2.table.type.mapping</name>
  <value>HIVE</value>
  <description>
    This setting reflects how HiveServer will report the table types for JDBC
    and other client implementations that retrieve the available tables and
    supported table types.
    HIVE: Exposes Hive's native table types like MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW
    CLASSIC: More generic types like TABLE and VIEW
  </description>
</property>
{code}
Status: Resolved (was: Patch Available) Committed to trunk. Thanks Prasad! Support alternate table types for HiveServer2 - Key: HIVE-4573 URL: https://issues.apache.org/jira/browse/HIVE-4573 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Johndee Burks Assignee: Prasad Mujumdar Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4573.1.patch, HIVE-4573.2.patch The getTables jdbc function no longer returns information when using normal JDBC table types like TABLE or VIEW. You must now use a more specific type such as MANAGED_TABLE or VIRTUAL_VIEW. An example application that will fail to return results against 0.10 is below; it works without issue in 0.9. In my 0.10 test I used HS2.
{code}
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
import org.apache.hive.jdbc.HiveDriver;
import java.sql.DatabaseMetaData;

public class TestGet {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      System.exit(1);
    }
    Connection con = DriverManager.getConnection("jdbc:hive2://hostname:1/default");
    DatabaseMetaData dbmd = con.getMetaData();
    String[] types = {"TABLE"};
    ResultSet rs = dbmd.getTables(null, null, "%", types);
    while (rs.next()) {
      System.out.println(rs.getString("TABLE_NAME"));
    }
  }
}
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
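The CLASSIC mode described in the release note above amounts to translating Hive's native table types into the generic JDBC ones. A hypothetical sketch of that mapping (illustrative only, not HiveServer2's actual code; the EXTERNAL_TABLE entry follows the release note's "more generic types like TABLE and VIEW" wording):

{code}
import java.util.Map;

public class TableTypeMapping {
    // Hypothetical CLASSIC mapping: Hive-native table types -> generic JDBC
    // table types, per the HIVE-4573 release note.
    static final Map<String, String> CLASSIC = Map.of(
        "MANAGED_TABLE", "TABLE",
        "EXTERNAL_TABLE", "TABLE",
        "VIRTUAL_VIEW", "VIEW");

    static String toClassic(String hiveType) {
        // Unknown types pass through unchanged.
        return CLASSIC.getOrDefault(hiveType, hiveType);
    }

    public static void main(String[] args) {
        System.out.println(toClassic("MANAGED_TABLE")); // TABLE
        System.out.println(toClassic("VIRTUAL_VIEW"));  // VIEW
    }
}
{code}

With hive.server2.table.type.mapping set to CLASSIC, a client passing {TABLE} to getTables() would again match managed and external tables, which is what the failing 0.10 example above relies on.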
[jira] [Updated] (HIVE-4051) Hive's metastore suffers from 1+N queries when querying partitions is slow
[ https://issues.apache.org/jira/browse/HIVE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4051: --- Status: Open (was: Patch Available) Hive's metastore suffers from 1+N queries when querying partitions is slow Key: HIVE-4051 URL: https://issues.apache.org/jira/browse/HIVE-4051 Project: Hive Issue Type: Bug Components: Clients, Metastore Environment: RHEL 6.3 / EC2 C1.XL Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: HIVE-4051.D11805.1.patch, HIVE-4051.D11805.2.patch, HIVE-4051.D11805.3.patch, HIVE-4051.D11805.4.patch, HIVE-4051.D11805.5.patch, HIVE-4051.D11805.6.patch, HIVE-4051.D11805.7.patch, HIVE-4051.D11805.8.patch Hive's query client takes a long time to initialize and start planning queries because of delays in creating all the MTable/MPartition objects. For a hive db with 1800 partitions, the metastore took 6-7 seconds to initialize - firing approximately 5900 queries to the mysql database. Several of those queries fetch exactly one row to create a single object on the client.
The following 12 queries were repeated for each partition, generating a storm of SQL queries:
{code}
4 Query SELECT `A0`.`SD_ID`,`B0`.`INPUT_FORMAT`,`B0`.`IS_COMPRESSED`,`B0`.`IS_STOREDASSUBDIRECTORIES`,`B0`.`LOCATION`,`B0`.`NUM_BUCKETS`,`B0`.`OUTPUT_FORMAT`,`B0`.`SD_ID` FROM `PARTITIONS` `A0` LEFT OUTER JOIN `SDS` `B0` ON `A0`.`SD_ID` = `B0`.`SD_ID` WHERE `A0`.`PART_ID` = 3945
4 Query SELECT `A0`.`CD_ID`,`B0`.`CD_ID` FROM `SDS` `A0` LEFT OUTER JOIN `CDS` `B0` ON `A0`.`CD_ID` = `B0`.`CD_ID` WHERE `A0`.`SD_ID` = 4871
4 Query SELECT COUNT(*) FROM `COLUMNS_V2` THIS WHERE THIS.`CD_ID`=1546 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COMMENT`,`A0`.`COLUMN_NAME`,`A0`.`TYPE_NAME`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `COLUMNS_V2` `A0` WHERE `A0`.`CD_ID` = 1546 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT `A0`.`SERDE_ID`,`B0`.`NAME`,`B0`.`SLIB`,`B0`.`SERDE_ID` FROM `SDS` `A0` LEFT OUTER JOIN `SERDES` `B0` ON `A0`.`SERDE_ID` = `B0`.`SERDE_ID` WHERE `A0`.`SD_ID` = 4871
4 Query SELECT COUNT(*) FROM `SORT_COLS` THIS WHERE THIS.`SD_ID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT `A0`.`COLUMN_NAME`,`A0`.`ORDER`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SORT_COLS` `A0` WHERE `A0`.`SD_ID` = 4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_VALUES` THIS WHERE THIS.`SD_ID_OID`=4871 AND THIS.`INTEGER_IDX`=0
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A1`.`STRING_LIST_ID`,`A0`.`INTEGER_IDX` AS NUCORDER0 FROM `SKEWED_VALUES` `A0` INNER JOIN `SKEWED_STRING_LIST` `A1` ON `A0`.`STRING_LIST_ID_EID` = `A1`.`STRING_LIST_ID` WHERE `A0`.`SD_ID_OID` = 4871 AND `A0`.`INTEGER_IDX` = 0 ORDER BY NUCORDER0
4 Query SELECT COUNT(*) FROM `SKEWED_COL_VALUE_LOC_MAP` WHERE `SD_ID` = 4871 AND `STRING_LIST_ID_KID` IS NOT NULL
4 Query SELECT 'org.apache.hadoop.hive.metastore.model.MStringList' AS NUCLEUS_TYPE,`A0`.`STRING_LIST_ID` FROM `SKEWED_STRING_LIST` `A0` INNER JOIN `SKEWED_COL_VALUE_LOC_MAP` `B0` ON `A0`.`STRING_LIST_ID` = `B0`.`STRING_LIST_ID_KID` WHERE `B0`.`SD_ID` = 4871
4 Query SELECT `A0`.`STRING_LIST_ID_KID`,`A0`.`LOCATION` FROM `SKEWED_COL_VALUE_LOC_MAP` `A0` WHERE `A0`.`SD_ID` = 4871 AND NOT (`A0`.`STRING_LIST_ID_KID` IS NULL)
{code}
This data is not detached or cached, so this operation is performed during every query plan for the partitions, even in the same Hive client. The queries are automatically generated by JDO/DataNucleus, which makes it nearly impossible to rewrite them into a single denormalized join operation and process it locally. Attempts to optimize this with JDO fetch-groups did not bear fruit in improving the query count.
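The 1+N shape described above is just round-trip arithmetic: one query to enumerate the partitions, then a fixed set of queries per partition, versus batching partition keys into one `IN (...)` query per chunk. A minimal sketch of that arithmetic (the class and method names here are invented for illustration; this is not Hive or DataNucleus code):

```java
/**
 * Round-trip arithmetic behind the 1+N query pattern: per-object fetching
 * issues a fixed number of queries for every partition, while batching
 * keys into IN (...) lists needs only one query per chunk.
 */
public class QueryCountSketch {

    /** 1 query to list partition IDs, then k queries per partition. */
    static int perPartitionRoundTrips(int partitions, int queriesPerPartition) {
        return 1 + partitions * queriesPerPartition;
    }

    /** 1 query to list partition IDs, then one batched IN (...) query per chunk. */
    static int batchedRoundTrips(int partitions, int batchSize) {
        return 1 + (partitions + batchSize - 1) / batchSize; // ceiling division
    }

    public static void main(String[] args) {
        // 1800 partitions x 12 repeated queries each, as in this issue:
        System.out.println(perPartitionRoundTrips(1800, 12)); // 21601
        // Batching 100 partition IDs per IN (...) query:
        System.out.println(batchedRoundTrips(1800, 100)); // 19
    }
}
```

Either way the dominant term of the per-object approach is linear in the partition count, which is consistent with the thousands of queries observed for 1800 partitions.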
[jira] [Commented] (HIVE-4990) ORC seeks fails with non-zero offset or column projection
[ https://issues.apache.org/jira/browse/HIVE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730235#comment-13730235 ] Hive QA commented on HIVE-4990: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596214/HIVE-4990.D12009.1.patch {color:green}SUCCESS:{color} +1 2759 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/314/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/314/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ORC seeks fails with non-zero offset or column projection - Key: HIVE-4990 URL: https://issues.apache.org/jira/browse/HIVE-4990 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.11.1 Attachments: HIVE-4990.D12009.1.patch The ORC reader gets exceptions when seeking with non-zero offsets or column projection.