[jira] [Assigned] (HIVE-4160) Vectorized Query Execution in Hive
[ https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey reassigned HIVE-4160:
------------------------------------------

    Assignee: Jitendra Nath Pandey  (was: Tony Murphy)

Key: HIVE-4160
URL: https://issues.apache.org/jira/browse/HIVE-4160
Project: Hive
Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: Hive-Vectorized-Query-Execution-Design.docx, Hive-Vectorized-Query-Execution-Design-rev10.docx, Hive-Vectorized-Query-Execution-Design-rev10.docx, Hive-Vectorized-Query-Execution-Design-rev10.pdf, Hive-Vectorized-Query-Execution-Design-rev2.docx, Hive-Vectorized-Query-Execution-Design-rev3.docx, Hive-Vectorized-Query-Execution-Design-rev3.docx, Hive-Vectorized-Query-Execution-Design-rev3.pdf, Hive-Vectorized-Query-Execution-Design-rev4.docx, Hive-Vectorized-Query-Execution-Design-rev4.pdf, Hive-Vectorized-Query-Execution-Design-rev5.docx, Hive-Vectorized-Query-Execution-Design-rev5.pdf, Hive-Vectorized-Query-Execution-Design-rev6.docx, Hive-Vectorized-Query-Execution-Design-rev6.pdf, Hive-Vectorized-Query-Execution-Design-rev7.docx, Hive-Vectorized-Query-Execution-Design-rev8.docx, Hive-Vectorized-Query-Execution-Design-rev8.pdf, Hive-Vectorized-Query-Execution-Design-rev9.docx, Hive-Vectorized-Query-Execution-Design-rev9.pdf

The Hive query execution engine currently processes one row at a time. A single row of data goes through all the operators before the next row can be processed. This mode of processing is very inefficient in terms of CPU usage; research has demonstrated that it yields very low instructions per cycle [MonetDB X100]. Hive also relies heavily on lazy deserialization, and data columns go through a layer of object inspectors that identify the column type, deserialize the data, and determine the appropriate expression routines in the inner loop. These layers of virtual method calls further slow down processing.

This work will add support for vectorized query execution to Hive: instead of individual rows, batches of about a thousand rows are processed at a time. Each column in the batch is represented as a vector of a primitive data type. The inner loop of execution scans these vectors very fast, avoiding method calls, deserialization, unnecessary if-then-else branches, etc. This substantially reduces CPU time and yields excellent instructions per cycle (i.e. improved processor pipeline utilization). See the attached design specification for more details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
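The batch-of-column-vectors idea can be sketched as follows. This is a hedged illustration under the design described above, not Hive's actual vectorization classes; `VectorBatchDemo`, `LongColumn`, and the batch size constant are invented for the example:

```java
// Minimal sketch of vectorized execution: process ~1024 rows per batch,
// one primitive array per column plus a null mask, with a tight inner loop
// and no per-row virtual method calls. Names are illustrative only.
public class VectorBatchDemo {
    static final int BATCH_SIZE = 1024;

    // One column of a batch: a vector of primitives and a null mask.
    static class LongColumn {
        final long[] vector = new long[BATCH_SIZE];
        final boolean[] isNull = new boolean[BATCH_SIZE];
    }

    // out[i] = a[i] + b[i] for the first `size` rows; NULL propagates.
    static void addLongColumns(LongColumn a, LongColumn b, LongColumn out, int size) {
        for (int i = 0; i < size; i++) {
            out.isNull[i] = a.isNull[i] || b.isNull[i];
            // Adding garbage for null rows is harmless; the mask governs.
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```

The whole point is that the inner loop above touches only primitive arrays, so the JIT can keep it branch-light and pipeline-friendly, in contrast to one-object-inspector-call-per-row processing.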
[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739267#comment-13739267 ]

Navis commented on HIVE-4579:
-----------------------------

    Sorry for the late comment, but would it be better to remove the MINA dependency, which is used only for IdentityHashSet?

Key: HIVE-4579
URL: https://issues.apache.org/jira/browse/HIVE-4579
Project: Hive
Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Fix For: 0.12.0
Attachments: h-4579.patch, HIVE-4579.4.patch, HIVE-4579.D11409.1.patch, HIVE-4579.D11409.2.patch, HIVE-4579.D11409.3.patch, pushdown.pdf

I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader.
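As a rough illustration of what a SARG-style interface buys a RecordReader, here is a hedged sketch; the names (`SargDemo`, `TruthValue`, `lessThanEquals`) are invented and are not the HIVE-4579 API. The idea is that a predicate evaluated against per-chunk min/max statistics answers YES / NO / MAYBE, so a reader can skip chunks that can never match without evaluating the full expression AST per row:

```java
// Sketch of a sargable predicate evaluated against column statistics.
// Interface and names are hypothetical, not the actual HIVE-4579 API.
public class SargDemo {
    enum TruthValue { YES, NO, MAYBE }

    // "column <= literal" checked against a chunk's [min, max] range.
    static TruthValue lessThanEquals(long min, long max, long literal) {
        if (max <= literal) return TruthValue.YES;   // every row matches
        if (min > literal) return TruthValue.NO;     // no row can match
        return TruthValue.MAYBE;                     // must read the chunk
    }
}
```

A NO result means the reader skips the chunk entirely; MAYBE means it reads and filters row by row, which is the generalization the issue asks to pull out of individual RecordReaders.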
[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739272#comment-13739272 ]

Teddy Choi commented on HIVE-5022:
----------------------------------

    It seems that multiplication is not the only operation that triggers this error. Multiple additions, subtractions, and a power after a division can trigger it too, so I will update the patch.

Key: HIVE-5022
URL: https://issues.apache.org/jira/browse/HIVE-5022
Project: Hive
Issue Type: Bug
Components: Types
Affects Versions: 0.11.0
Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
Attachments: HIVE-5022.1.patch.txt

When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation; instead, a NULL is returned. The following yield NULL results:

select (cast(4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1;
select cast(4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1;

If we move the multiplication operation to be first, then it will successfully calculate the result.
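The shape of this bug can be reproduced outside Hive with BigDecimal, assuming (as the symptoms suggest) that the NULL appears when an intermediate result outgrows a fixed maximum precision. The cap of 38 digits and the helper names below are assumptions for illustration, not Hive's actual constants or code:

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Sketch: a division-first chain grows the quotient to many significant
// digits, so the subsequent multiply blows past a fixed precision cap and
// yields null; multiply-first stays small. MAX_PRECISION is an assumed cap.
public class DecimalScaleDemo {
    static final int MAX_PRECISION = 38; // hypothetical fixed-precision limit

    // Returns null when the exact product no longer fits, mimicking the
    // reported NULL results.
    static BigDecimal multiplyWithCap(BigDecimal a, BigDecimal b) {
        BigDecimal r = a.multiply(b);
        return r.precision() > MAX_PRECISION ? null : r;
    }

    // 4.53 / 25.86 is non-terminating, so the quotient carries 40
    // significant digits under this MathContext.
    static BigDecimal quotient() {
        return new BigDecimal("4.53")
                .divide(new BigDecimal("25.86"), new MathContext(40));
    }
}
```

With division first, `multiplyWithCap(quotient(), new BigDecimal("0.087"))` overflows the cap and returns null; multiplying `4.53 * 0.087` first keeps the precision tiny, which matches the reporter's observation that reordering the operations avoids the NULL.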
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4569:
------------------------------

    Attachment: HIVE-4569.D12231.1.patch

jaideepdhok requested code review of "HIVE-4569 [jira] GetQueryPlan api in Hive Server2" (changes based on the previous code review).

Reviewers: JIRA

Changes to the service package. It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added.

TEST PLAN
  Unit tests included

REVISION DETAIL
  https://reviews.facebook.net/D12231

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java
  ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java
  service/if/TCLIService.thrift
  service/src/java/org/apache/hive/service/cli/CLIService.java
  service/src/java/org/apache/hive/service/cli/CLIServiceClient.java
  service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java
  service/src/java/org/apache/hive/service/cli/ICLIService.java
  service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
  service/src/java/org/apache/hive/service/cli/operation/Operation.java
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/29235/

To: JIRA, jaideepdhok

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch
[jira] [Work started] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-5022 started by Teddy Choi.

Key: HIVE-5022
URL: https://issues.apache.org/jira/browse/HIVE-5022
Project: Hive
Issue Type: Bug
Components: Types
Affects Versions: 0.11.0
Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
Attachments: HIVE-5022.1.patch.txt
[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-5022:
-----------------------------

    Attachment: HIVE-5022.2.patch.txt

Key: HIVE-5022
URL: https://issues.apache.org/jira/browse/HIVE-5022
Project: Hive
Issue Type: Bug
Components: Types
Affects Versions: 0.11.0
Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt
[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-5022:
-----------------------------

    Status: Patch Available  (was: In Progress)

Key: HIVE-5022
URL: https://issues.apache.org/jira/browse/HIVE-5022
Project: Hive
Issue Type: Bug
Components: Types
Affects Versions: 0.11.0
Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739276#comment-13739276 ]

Jaideep Dhok commented on HIVE-4569:
------------------------------------

    [~vgumashta] Initially it was split into three JIRAs, but other people suggested that it would be easier to track progress in a single JIRA. I've completed most of the changes, and have updated the patch based on the last review by [~cwsteinbach].

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4569:
------------------------------

    Attachment: HIVE-4569.D12237.1.patch

jaideepdhok requested code review of "HIVE-4569 [jira] GetQueryPlan api in Hive Server2" (changes post the last review).

Reviewers: JIRA

Changes for HIVE-4569 post review.

TEST PLAN
  Unit tests included

REVISION DETAIL
  https://reviews.facebook.net/D12237

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java
  service/src/gen/thrift/gen-cpp/TCLIService.cpp
  service/src/gen/thrift/gen-cpp/TCLIService.h
  service/src/gen/thrift/gen-cpp/TCLIService_server.skeleton.cpp
  service/src/gen/thrift/gen-cpp/TCLIService_types.cpp
  service/src/gen/thrift/gen-cpp/TCLIService_types.h
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIService.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementAsyncReq.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementAsyncResp.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetOperationStatusResp.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanReq.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanResp.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStructTypeEntry.java
  service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TUnionTypeEntry.java
  service/src/gen/thrift/gen-php/TCLIService.php
  service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote
  service/src/gen/thrift/gen-py/TCLIService/TCLIService.py
  service/src/gen/thrift/gen-py/TCLIService/ttypes.py
  service/src/gen/thrift/gen-py/hive_service/ThriftHive-remote
  service/src/gen/thrift/gen-rb/t_c_l_i_service.rb
  service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb
  service/src/java/org/apache/hive/service/cli/OperationStatus.java

To: JIRA, jaideepdhok

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739279#comment-13739279 ]

Jaideep Dhok commented on HIVE-4569:
------------------------------------

    Sorry for the duplicate review request. Please refer to the last one.

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch
[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739285#comment-13739285 ]

Teddy Choi commented on HIVE-5022:
----------------------------------

    Review request on https://reviews.apache.org/r/13553/

Key: HIVE-5022
URL: https://issues.apache.org/jira/browse/HIVE-5022
Project: Hive
Issue Type: Bug
Components: Types
Affects Versions: 0.11.0
Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107
Reporter: Kevin Soo Hoo
Assignee: Teddy Choi
Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt
[jira] [Updated] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW
[ https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2608:
------------------------------

    Attachment: HIVE-2608.D4317.8.patch

navis updated the revision "HIVE-2608 [jira] Do not require AS a,b,c part in LATERAL VIEW":

  - Rebased to trunk
  - Fixed test failures

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D4317

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D4317?vs=36597&id=37845#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientnegative/udtf_not_supported2.q
  ql/src/test/queries/clientpositive/lateral_view_noalias.q
  ql/src/test/results/clientnegative/lateral_view_join.q.out
  ql/src/test/results/clientnegative/udtf_not_supported2.q.out
  ql/src/test/results/clientpositive/lateral_view_noalias.q.out

To: JIRA, ashutoshc, navis
Cc: ikabiljo

Key: HIVE-2608
URL: https://issues.apache.org/jira/browse/HIVE-2608
Project: Hive
Issue Type: Improvement
Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch

Currently, it is required to state column names when LATERAL VIEW is used. That shouldn't be necessary, since the UDTF returns a struct that contains column names, and those should be used by default. For example, it would be great if this were possible:

SELECT t.*, t.key1 + t.key4
FROM some_table
LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key3') t;
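The proposal boils down to a defaulting rule: when the `AS a,b,c` clause is absent, fall back to the field names of the struct the UDTF returns. A hedged sketch of that rule, with invented helper names (not the SemanticAnalyzer change itself):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: resolve LATERAL VIEW column aliases. Explicit "AS a, b, c" wins;
// otherwise default to the UDTF's struct field names. Names are illustrative.
public class LateralViewAliases {
    static List<String> resolveAliases(List<String> explicitAliases,
                                       List<String> udtfStructFieldNames) {
        if (explicitAliases != null && !explicitAliases.isEmpty()) {
            return explicitAliases;                     // user-specified aliases
        }
        return new ArrayList<>(udtfStructFieldNames);   // default to struct fields
    }
}
```

So for the JSON_TUPLE example above, `t.*` would expand to whatever column names the UDTF's output struct declares, with no AS clause required.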
[jira] [Updated] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW
[ https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2608:
------------------------

    Status: Patch Available  (was: Open)

Key: HIVE-2608
URL: https://issues.apache.org/jira/browse/HIVE-2608
Project: Hive
Issue Type: Improvement
Components: Query Processor, UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch
[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC
[ https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739319#comment-13739319 ]

Hive QA commented on HIVE-4246:
-------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597820/HIVE-4246.D11415.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 2876 tests executed

Failed tests:
{noformat}
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testPartialPlan
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_join1
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/427/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/427/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

Key: HIVE-4246
URL: https://issues.apache.org/jira/browse/HIVE-4246
Project: Hive
Issue Type: New Feature
Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch

By using the push-down predicates from the table scan operator, ORC can skip over 10,000 rows at a time that won't satisfy the predicate. This will help a lot, especially if the file is sorted by the column used in the predicate.
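The 10,000-row granularity comes from evaluating the pushed-down predicate against per-row-group min/max statistics rather than per row. A hedged sketch of that skipping decision, with invented names (not ORC's reader code):

```java
// Sketch of row-group skipping with min/max statistics: for a predicate
// like "col = literal", any 10,000-row group whose [min, max] range cannot
// contain the literal is skipped without being read. Names are illustrative.
public class RowGroupSkipDemo {
    static class GroupStats {
        final long min, max;
        GroupStats(long min, long max) { this.min = min; this.max = max; }
    }

    // True when the group may contain a row with col == literal.
    static boolean mightContain(GroupStats stats, long literal) {
        return literal >= stats.min && literal <= stats.max;
    }

    // How many groups actually need to be read for "col == literal".
    static int groupsToRead(GroupStats[] groups, long literal) {
        int n = 0;
        for (GroupStats g : groups) {
            if (mightContain(g, literal)) n++;
        }
        return n;
    }
}
```

This also shows why sorting helps, as the description notes: with the file sorted on the predicate column, the [min, max] ranges of the groups barely overlap, so almost every group is skippable.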
[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739324#comment-13739324 ]

Hive QA commented on HIVE-3562:
-------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12580643/HIVE-3562.D5967.5.patch

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/429/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/429/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-429/source-prep.txt
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
{noformat}
Review Request 13555: HIVE-5052: Set parallelism when generating the tez tasks
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13555/
-----------------------------------------------------------

Review request for hive.

Bugs: HIVE-5052
    https://issues.apache.org/jira/browse/HIVE-5052

Repository: hive-git

Description
-----------
Set parallelism when generating the tez tasks.

Diffs
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7408a5a
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java edb55fa
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 48145ad

Diff: https://reviews.apache.org/r/13555/diff/

Testing
-------

Thanks,
Vikram Dixit Kumaraswamy
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739334#comment-13739334 ]

Thejas M Nair commented on HIVE-4569:
-------------------------------------

    [~jaideepdhok] The patches on the phabricator links look incomplete; for example, service/if/TCLIService.thrift is missing. Can you update the patch in the phabricator link with the original review comments (https://reviews.facebook.net/D11469)? That way it is easier to track changes across patches; having a new phabricator link for each patch iteration makes it difficult to follow the changes between patches.

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch
[jira] [Updated] (HIVE-5052) Set parallelism when generating the tez tasks
[ https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5052: - Attachment: HIVE-5052.1.patch.txt Accomplished in a manner similar to what happens in the map-reduce path of the code. Set parallelism when generating the tez tasks - Key: HIVE-5052 URL: https://issues.apache.org/jira/browse/HIVE-5052 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-5052.1.patch.txt In GenTezTask any intermediate task has parallelism set to 1. This needs to be fixed.
Re: Review Request 13555: HIVE-5052: Set parallelism when generating the tez tasks
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13555/ --- (Updated Aug. 14, 2013, 7:15 a.m.) Review request for hive. Changes --- Removed cruft. Bugs: HIVE-5052 https://issues.apache.org/jira/browse/HIVE-5052 Repository: hive-git Description --- Set parallelism when generating the tez tasks. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7408a5a ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java edb55fa ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 48145ad Diff: https://reviews.apache.org/r/13555/diff/ Testing --- Thanks, Vikram Dixit Kumaraswamy
[jira] [Updated] (HIVE-5052) Set parallelism when generating the tez tasks
[ https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5052: - Attachment: HIVE-5052.2.patch.txt Removed cruft.
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739363#comment-13739363 ] Thejas M Nair commented on HIVE-4569: - [~jaideepdhok] [~cwsteinbach] Should we keep the api simple (small) by just making the current execute function asynchronous, instead of adding an additional execute function to the api? I think [~henryr] has a good point that it was always documented to be asynchronous (it just happened that it was always so late in returning the call that the operation was finished :) ). Also, I think it makes sense to make the GetResultSetMetadata and FetchResults apis block until the operation finishes, instead of throwing an error if the status is not FINISHED. This will also help to prevent breakage of any user code that was written with the assumption that execute is blocking.
[jira] [Commented] (HIVE-4822) implement vectorized math functions
[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739375#comment-13739375 ] Teddy Choi commented on HIVE-4822: -- I had some difficulties applying this patch today. Applying HIVE-4989 on the vectorization branch changed outputDirectory and templateDirectory, so this patch needs a few updates. However, the code looks good. :) implement vectorized math functions --- Key: HIVE-4822 URL: https://issues.apache.org/jira/browse/HIVE-4822 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, HIVE-4822.5-vectorization.patch Implement vectorized support for all the built-in math functions. This includes implementing the vectorized operation, and tying it all together in VectorizationContext so it runs end-to-end. These functions include: Round(Col), Round(Col, N), Floor(Col), Ceil(Col), Rand(), Rand(seed), Exp(Col), Ln(Col), Log10(Col), Log2(Col), Log(base, Col), Pow(col, p), Power(col, p), Sqrt(Col), Bin(Col), Hex(Col), Unhex(Col), Conv(Col, from_base, to_base), Abs(Col), Pmod(arg1, arg2), Sin(Col), Asin(Col), Cos(Col), ACos(Col), Atan(Col), Degrees(Col), Radians(Col), Positive(Col), Negative(Col), Sign(Col), E(), Pi(). To reduce the total code volume, do an implicit type cast from non-double input types to double. Also, POSITIVE and NEGATIVE are syntactic sugar for unary + and unary -, so reuse code for those as appropriate. Try to call the function directly in the inner loop and avoid new() or expensive operations, as appropriate. Templatize the code where appropriate, e.g. all the unary functions of form DOUBLE func(DOUBLE) can probably be done with a template.
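The inner-loop pattern the description above asks for ("call the function directly in the inner loop and avoid new()") can be sketched in a few lines. This is an illustrative standalone example, not Hive's actual vectorization code; the VectorMathSketch class is hypothetical and stands in for Hive's generated expression classes over VectorizedRowBatch:

```java
// Minimal sketch of a vectorized unary math function: the math call runs in a
// tight loop over a primitive array, with no per-row object allocation or
// virtual dispatch. Exp, Ln, Sqrt, etc. differ only in the Math call, which is
// why the description suggests templatizing the DOUBLE func(DOUBLE) family.
class VectorMathSketch {
    // Compute out[i] = exp(in[i]) for the first n entries of the column.
    static void vectorExp(double[] in, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = Math.exp(in[i]);
        }
    }
}
```

The same shape, with Math.exp swapped for another function, covers most of the unary entries in the list above.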
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739385#comment-13739385 ] Jaideep Dhok commented on HIVE-4569: bq. Having a new phabricator link for each patch iteration makes it difficult to follow the changes between patches. [~thejas] Looks like the changes got split into two requests. Unfortunately I am unable to update the previous revision, as I had lost the previous arc commit. I will put up a new request and keep updating it if there are further comments. bq. Should we keep the api simple (small) by just making the current execute function asynchronous instead of adding an additional execute function in the api ? [~thejas] I think making executeStatement async by default may break users' expectations, since it's currently a blocking call. [~cwsteinbach] had suggested earlier creating two separate calls, executeStatement and executeStatementAsync, so that the API is easier to understand. I agree with that approach. If we have two different calls, users can pick one based on their need. For getting the result set in the async case, the flow would be: ExecuteStatementAsync, GetOperationStatus (polled until the query completes), then fetch the result set.
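The async flow described in the comment above (ExecuteStatementAsync, then GetOperationStatus until the query completes, then fetch results) can be sketched as follows. The QueryClient interface and AsyncFlow class here are hypothetical stand-ins; the real HiveServer2 API is Thrift-generated:

```java
import java.util.List;

// Hypothetical client-side view of the proposed async API.
interface QueryClient {
    String executeStatementAsync(String sql);   // returns an operation handle
    String getOperationStatus(String handle);   // e.g. "RUNNING" or "FINISHED"
    List<String> fetchResults(String handle);
}

class AsyncFlow {
    // Submit, poll until the operation finishes, then fetch the result set.
    static List<String> runAndFetch(QueryClient client, String sql) {
        String handle = client.executeStatementAsync(sql);
        while (!"FINISHED".equals(client.getOperationStatus(handle))) {
            // A real client would sleep/backoff between polls here.
        }
        return client.fetchResults(handle);
    }
}
```

This also illustrates Amareshwari's later point: a purely async API forces every client into this polling loop, which is why keeping a synchronous call alongside it is attractive for JDBC.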
[jira] [Commented] (HIVE-4989) Consolidate and simplify vectorization code and test generation
[ https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739403#comment-13739403 ] Teddy Choi commented on HIVE-4989: -- [~ashutoshc], I could not compile the latest code on the vectorization branch. I have double-checked it. It seems like there was an error in applying the patch. Please check it again. :) * ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java : the old location. * ql/src/gen/vectorization/org/apache/hadoop/hive/ql/exec/vector/gen/CodeGen.java : the expected location per https://reviews.apache.org/r/13274/diff/ * ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java : the actual location in the vectorization branch as of https://github.com/apache/hive/commit/e6f59f5d0711c52badc89868e4178a1b2ef54e53 Consolidate and simplify vectorization code and test generation --- Key: HIVE-4989 URL: https://issues.apache.org/jira/browse/HIVE-4989 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Tony Murphy Fix For: vectorization-branch Attachments: HIVE-4989-vectorization.patch The current code generation is unwieldy to use and prone to errors. This change consolidates all the code and test generation into a single location, and removes the need to manually place files, which can lead to missing or incomplete code or tests.
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739420#comment-13739420 ] Vaibhav Gumashta commented on HIVE-4569: [~thejas] I think you mean that by making the GetResultSetMetadata and FetchResults APIs blocking, we can change executeStatement to async by default but at the same time not break any user code?
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Status: Open (was: Patch Available) Avoiding object instantiation in loops (issue 6) Key: HIVE-5018 URL: https://issues.apache.org/jira/browse/HIVE-5018 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Priority: Minor Fix For: 0.12.0 Attachments: HIVE-5018.1.patch.txt Object instantiation inside loops is very expensive. Where possible, object references should be created outside the loop so that they can be reused.
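The refactoring this sub-task describes can be illustrated with a minimal sketch (illustrative code, not a Hive source file): a mutable object is created once outside the loop and reset on each iteration, instead of being reallocated per pass:

```java
import java.util.ArrayList;
import java.util.List;

class HoistedAllocation {
    // Format each row of ints as a comma-separated string.
    static List<String> csvRows(int[][] rows) {
        List<String> out = new ArrayList<>();
        StringBuilder sb = new StringBuilder();  // created once, outside the loop
        for (int[] row : rows) {
            sb.setLength(0);                     // reset instead of reallocating
            for (int v : row) {
                if (sb.length() > 0) sb.append(',');
                sb.append(v);
            }
            out.add(sb.toString());
        }
        return out;
    }
}
```

The naive version would call `new StringBuilder()` inside the outer loop, creating one garbage object per row for no benefit.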
[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739423#comment-13739423 ] Benjamin Jakobus commented on HIVE-5018: Mhh, sorry: I don't quite get this: it compiles for me using ant package... Is there something else that I am missing?
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: (was: HIVE-5018.1.patch.txt)
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Attachment: HIVE-5018.1.patch.txt
[jira] [Commented] (HIVE-4778) hive.server2.authentication CUSTOM not working
[ https://issues.apache.org/jira/browse/HIVE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739428#comment-13739428 ] Hive QA commented on HIVE-4778: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597852/HIVE-4778.D12213.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2857 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/431/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/431/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. hive.server2.authentication CUSTOM not working -- Key: HIVE-4778 URL: https://issues.apache.org/jira/browse/HIVE-4778 Project: Hive Issue Type: Bug Components: Authentication Affects Versions: 0.11.0 Environment: CentOS release 6.2 x86_64 java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: Zdenek Ott Assignee: Azrael Park Attachments: HIVE-4778.D12207.1.patch, HIVE-4778.D12213.1.patch I have created my own class PamAuthenticationProvider that implements PasswdAuthenticationProvider interface. 
I have put the jar into the hive lib directory and configured hive-site.xml in the following way:
<property>
  <name>hive.server2.authentication</name>
  <value>CUSTOM</value>
</property>
<property>
  <name>hive.server2.custom.authentication.class</name>
  <value>com.avast.ff.hive.PamAuthenticationProvider</value>
</property>
I use SQuirreL and the jdbc drivers to connect to hive. During authentication Hive throws the following exception: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128) at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.<init>(CustomAuthenticationProviderImpl.java:20) at org.apache.hive.service.auth.AuthenticationProviderFactory.getAuthenticationProvider(AuthenticationProviderFactory.java:57) at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:61) at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:127) at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:509) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:264) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at java.lang.Class.getConstructor0(Class.java:2706) at java.lang.Class.getDeclaredConstructor(Class.java:1985) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122) ... 12 more I have made a small patch for org.apache.hive.service.auth.CustomAuthenticationProviderImpl that solved my problem, but I'm not sure if it's the best solution. Here is the patch:
--- CustomAuthenticationProviderImpl.java 2013-06-20 14:55:22.473995184 +0200
+++ CustomAuthenticationProviderImpl.java.new 2013-06-20 14:57:36.549012966 +0200
@@ -33,7 +33,7 @@
   HiveConf conf = new HiveConf();
   this.customHandlerClass = (Class<? extends PasswdAuthenticationProvider>) conf.getClass(
-      HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.name(),
+      HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.varname,
       PasswdAuthenticationProvider.class);
   this.customProvider = ReflectionUtils.newInstance(this.customHandlerClass, conf);
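The one-line fix in the patch above swaps `name()` for `varname`. A minimal sketch of why that matters, using a hypothetical MiniConfVars enum that only mirrors the shape of HiveConf.ConfVars:

```java
// For a Java enum, name() returns the constant's Java identifier, while the
// configuration key the lookup actually needs lives in a field (varname in
// HiveConf.ConfVars). Passing name() to conf.getClass() therefore looks up a
// key that is never set, so the default interface class is returned and its
// missing no-arg constructor triggers the NoSuchMethodException above.
enum MiniConfVars {
    HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS("hive.server2.custom.authentication.class");

    final String varname;  // the key actually present in hive-site.xml
    MiniConfVars(String varname) { this.varname = varname; }
}
```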
[jira] [Updated] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5018: --- Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-5018) Avoiding object instantiation in loops (issue 6)
[ https://issues.apache.org/jira/browse/HIVE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739431#comment-13739431 ] Benjamin Jakobus commented on HIVE-5018: Sorry Brock - whenever I run ant -Dhadoop.version=1.2.1 package the build succeeds... It's odd that I don't catch the compile-time errors. Is there something that I am missing?
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739435#comment-13739435 ] Amareshwari Sriramadasu commented on HIVE-4569: --- I think it makes sense to have two apis, as JDBC drivers can call the sync one and other users interested in async can call the async api. Though the documentation of execute() has to be changed to say that it executes synchronously.
[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists
[ https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4645: Status: Patch Available (was: Open) Stat information like numFiles and totalSize is not correct when sub-directory is exists Key: HIVE-4645 URL: https://issues.apache.org/jira/browse/HIVE-4645 Project: Hive Issue Type: Test Components: Statistics Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4645.D11037.1.patch The test infer_bucket_sort_list_bucket.q returns 4096 as totalSize, but that is the size of the parent directory, not the sum of the file sizes.
[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)
[ https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5019: --- Description: Issue 1 - use of StringBuilder over += inside loops. java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java java/org/apache/hadoop/hive/ql/plan/PlanUtils.java java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java java/org/apache/hadoop/hive/ql/udf/UDFLike.java java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java was: Issue 1 (use of StringBuilder over +=) java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java java/org/apache/hadoop/hive/ql/plan/PlanUtils.java java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java java/org/apache/hadoop/hive/ql/udf/UDFLike.java java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java Use StringBuffer instead of += (issue 1) Key: HIVE-5019 
URL: https://issues.apache.org/jira/browse/HIVE-5019 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Fix For: 0.12.0 Attachments: HIVE-5019.2.patch.txt Issue 1 - use of StringBuilder over += inside loops. java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java java/org/apache/hadoop/hive/ql/plan/PlanUtils.java java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java java/org/apache/hadoop/hive/ql/udf/UDFLike.java java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java
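The += versus StringBuilder change tracked here can be shown in miniature (illustrative code, not one of the Hive files listed above): `+=` on a String inside a loop allocates a new String and copies the accumulated prefix on every pass, roughly O(n^2) overall, while a StringBuilder appends in place:

```java
class ConcatComparison {
    // Before: each += allocates a fresh String and copies what came before.
    static String withPlusEquals(String[] parts) {
        String s = "";
        for (String p : parts) s += p;
        return s;
    }

    // After: one builder, amortized O(1) per append.
    static String withBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);
        return sb.toString();
    }
}
```

Both produce the same result; only the allocation behavior differs, which is why the change is safe to apply mechanically across the listed files.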
[jira] [Commented] (HIVE-5060) JDBC driver assumes executeStatement is synchronous
[ https://issues.apache.org/jira/browse/HIVE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739437#comment-13739437 ] Amareshwari Sriramadasu commented on HIVE-5060: --- @Henry, HIVE-4569 adds another api to call execute asynchronously. After that, current code of jdbc driver should just work. If we have a synchronous api, the clients such as jdbc can fetch results after the execute immediately without bombarding the server with so many get-status calls. So, i definitely see the need for two apis. JDBC driver assumes executeStatement is synchronous --- Key: HIVE-5060 URL: https://issues.apache.org/jira/browse/HIVE-5060 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.11.0 Reporter: Henry Robinson Fix For: 0.11.1, 0.12.0 Attachments: 0001-HIVE-5060-JDBC-driver-assumes-executeStatement-is-sy.patch The JDBC driver seems to assume that {{ExecuteStatement}} is a synchronous call when performing updates via {{executeUpdate}}, where the following comment on the RPC in the Thrift file indicates otherwise: {code} // ExecuteStatement() // // Execute a statement. // The returned OperationHandle can be used to check on the // status of the statement, and to fetch results once the // statement has finished executing. {code} I understand that Hive's implementation of {{ExecuteStatement}} is blocking (see https://issues.apache.org/jira/browse/HIVE-4569), but presumably other implementations of the HiveServer2 API (and I'm talking specifically about Impala here, but others might have a similar concern) should be free to return a pollable {{OperationHandle}} per the specification. The JDBC driver's {{executeUpdate}} is as follows: {code} public int executeUpdate(String sql) throws SQLException { execute(sql); return 0; } {code} {{execute(sql)}} discards the {{OperationHandle}} that it gets from the server after determining whether there are results to be fetched. 
This is problematic for us, because Impala will cancel queries that are running when a session exits, but there's no easy way to be sure that an {{INSERT}} statement has completed before terminating a session on the client.
[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists
[ https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4645: -- Attachment: HIVE-4645.D11037.2.patch navis updated the revision HIVE-4645 [jira] Stat information like numFiles and totalSize is not correct when sub-directory is exists. Fixed more stats on LB Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D11037 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D11037?vs=34215&id=37857#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java ql/src/test/results/clientpositive/infer_bucket_sort_list_bucket.q.out ql/src/test/results/clientpositive/list_bucket_dml_7.q.out ql/src/test/results/clientpositive/list_bucket_dml_8.q.out To: JIRA, navis
[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC
[ https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739440#comment-13739440 ] Gunther Hagleitner commented on HIVE-4246: -- [~owen.omalley] The join1 test in the mini mr driver doesn't fail for me locally. I think that's unrelated. But the TestRecordReaderImpl test failure seems legit. Implement predicate pushdown for ORC Key: HIVE-4246 URL: https://issues.apache.org/jira/browse/HIVE-4246 Project: Hive Issue Type: New Feature Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch By using the push-down predicates from the table scan operator, ORC can skip over 10,000 rows at a time that won't satisfy the predicate. This will help a lot, especially if the file is sorted by the column that is used in the predicate.
[jira] [Commented] (HIVE-5052) Set parallelism when generating the tez tasks
[ https://issues.apache.org/jira/browse/HIVE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739452#comment-13739452 ] Gunther Hagleitner commented on HIVE-5052: -- RB? - next power of two in java can be done in a simpler way: {code} j = Integer.highestOneBit(i); return i == j ? i : j << 1; {code} - We shouldn't change the default for BYTESPERREDUCER. If we want a different default for TEZ we should probably create a different var. - The comment should mention what happens with multi parent reduce-work - There seems to be some dead code at the end of the file - The setting of the var can be broken into separate method in the class - If the reducesink specifies a specific number of reducers, do we need to carry that number through additional stages? Right now you will add other stuff to it during the walk. Set parallelism when generating the tez tasks - Key: HIVE-5052 URL: https://issues.apache.org/jira/browse/HIVE-5052 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-5052.1.patch.txt, HIVE-5052.2.patch.txt In GenTezTask any intermediate task has parallelism set to 1. This needs to be fixed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
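[Editorial note] Gunther's one-liner, wrapped in a method for clarity (the method and class names here are mine, not from the patch):

```java
public class NextPowerOfTwo {
    // Round a positive int up to the next power of two, as suggested
    // in the review comment above: take the highest set bit, and if the
    // input isn't already a power of two, shift it left once.
    static int nextPowerOfTwo(int i) {
        int j = Integer.highestOneBit(i);
        return i == j ? i : j << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(5));   // prints 8
        System.out.println(nextPowerOfTwo(16));  // prints 16
    }
}
```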
[jira] [Commented] (HIVE-5044) StringUtils
[ https://issues.apache.org/jira/browse/HIVE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739480#comment-13739480 ] Benjamin Jakobus commented on HIVE-5044: Out of interest, is there any performance difference when using StringUtil.join()? Or is it just to make things neater? StringUtils --- Key: HIVE-5044 URL: https://issues.apache.org/jira/browse/HIVE-5044 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Fix For: 0.12.0 When you see code like this: {code}
first = true;
for (int k = 0; k < columnSize; k++) {
  String newColName = i + VALUE + k; // any name, it does not matter.
  if (!first) {
    valueColNames = valueColNames + ",";
    valueColTypes = valueColTypes + ",";
  }
  valueColNames = valueColNames + newColName;
  valueColTypes = valueColTypes + valueCols.get(k).getTypeString();
  first = false;
}
{code} (the attached patch rewrites these string concatenations as StringBuilder.append() calls) Can you replace it with StringUtils.join()? I have seen this in about 4 places in Hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
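[Editorial note] On Benjamin's question: both the manual loop and a join are O(n) over the column list, so the win is mostly readability plus avoiding accidental quadratic String concatenation. A minimal sketch of what the join-based replacement could look like, using JDK 8's String.join in place of Commons Lang's StringUtils.join (the column names here are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class JoinExample {
    // Join column names with a comma separator in one call, instead of
    // appending "," by hand inside a loop guarded by a 'first' flag.
    static String joinNames(List<String> cols) {
        return String.join(",", cols);
    }

    public static void main(String[] args) {
        List<String> valueCols = Arrays.asList("_col0", "_col1", "_col2");  // hypothetical names
        System.out.println(joinNames(valueCols));  // prints _col0,_col1,_col2
    }
}
```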
[jira] [Commented] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739502#comment-13739502 ] Hive QA commented on HIVE-4003: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597856/HIVE-4003.patch {color:green}SUCCESS:{color} +1 2856 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/432/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/432/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)
[ https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Jakobus updated HIVE-5019: --- Attachment: HIVE-5019.3.patch.txt Use StringBuffer instead of += (issue 1) Key: HIVE-5019 URL: https://issues.apache.org/jira/browse/HIVE-5019 Project: Hive Issue Type: Sub-task Reporter: Benjamin Jakobus Assignee: Benjamin Jakobus Fix For: 0.12.0 Attachments: HIVE-5019.2.patch.txt, HIVE-5019.3.patch.txt Issue 1 - use of StringBuilder over += inside loops. java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java java/org/apache/hadoop/hive/ql/plan/PlanUtils.java java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java java/org/apache/hadoop/hive/ql/udf/UDFLike.java java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
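[Editorial note] A toy illustration of the pattern the patch targets (names are illustrative; note the patch body uses StringBuilder even though the summary says StringBuffer):

```java
public class ConcatLoop {
    // String += in a loop copies the entire accumulated string on every
    // iteration, giving O(n^2) total work for n appends.
    static String withConcat(String[] parts) {
        String s = "";
        for (String p : parts) {
            s = s + p;  // allocates a new String and copies 's' each time
        }
        return s;
    }

    // StringBuilder appends into a growable buffer: amortized O(1) per call.
    static String withBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(withConcat(parts));   // prints abc
        System.out.println(withBuilder(parts));  // prints abc
    }
}
```

Both produce identical results; the difference only shows up as the loop count grows, which is why the patch confines itself to concatenation inside loops.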
[jira] [Commented] (HIVE-3173) implement getTypeInfo database metadata method
[ https://issues.apache.org/jira/browse/HIVE-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739593#comment-13739593 ] Hive QA commented on HIVE-3173: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597865/Hive-3173.patch.txt {color:green}SUCCESS:{color} +1 2857 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/433/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/433/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. implement getTypeInfo database metadata method --- Key: HIVE-3173 URL: https://issues.apache.org/jira/browse/HIVE-3173 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.8.1 Reporter: N Campbell Attachments: Hive-3173.patch.txt The JDBC driver does not implement the database metadata method getTypeInfo. Hence, an application cannot dynamically determine the available type information and associated properties. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739644#comment-13739644 ] Sean Busbey commented on HIVE-4789: --- [~brocknoland], I haven't had time to put together a test to hit the MetaStoreUtil changes and it seems unlikely I will this week. Given that, I think it's probably a good idea to break those changes into a different ticket. FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt HIVE-3953 fixed using partitioned avro tables for anything that used the MapOperator, but those that rely on FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4985) refactor/clean up partition name pruning to be usable inside metastore server
[ https://issues.apache.org/jira/browse/HIVE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739686#comment-13739686 ] Hudson commented on HIVE-4985: -- SUCCESS: Integrated in Hive-trunk-h0.21 #2267 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2267/]) HIVE-4985 : refactor/clean up partition name pruning to be usable inside metastore server (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1513596) * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/listbucketingpruner/ListBucketingPruner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/PrunedPartitionList.java refactor/clean up partition name pruning to be usable inside metastore server -- Key: HIVE-4985 URL: https://issues.apache.org/jira/browse/HIVE-4985 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-4985.D11961.1.patch, HIVE-4985.D11961.2.patch, HIVE-4985.D11961.3.patch, HIVE-4985.D11961.4.patch, HIVE-4985.D11961.5.patch Preliminary for HIVE-4914. 
The patch is going to be large already, so some refactoring and dead code removal that is non-controversial can be done in advance in a separate patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739684#comment-13739684 ] Hive QA commented on HIVE-5022: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597894/HIVE-5022.2.patch.txt {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 2856 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testUnionAndTimestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/434/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/434/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. Decimal Arithmetic generates NULL value --- Key: HIVE-5022 URL: https://issues.apache.org/jira/browse/HIVE-5022 Project: Hive Issue Type: Bug Components: Types Affects Versions: 0.11.0 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107 Reporter: Kevin Soo Hoo Assignee: Teddy Choi Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation. Instead, a NULL is returned. 
The following yield NULL results: select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1; select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1; If we move the multiplication operation to be first, then it will successfully calculate the result. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
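[Editorial note] For readers reproducing the arithmetic outside Hive: plain java.math.BigDecimal shows the same class of pitfall. 4.53 / 25.86 has no terminating decimal expansion, so a division without an explicit scale and rounding mode fails outright, and the scale chosen for the quotient then feeds the precision of the subsequent multiplication. A sketch of the analogy only — this is not Hive's decimal implementation:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalDivide {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("4.53");
        BigDecimal b = new BigDecimal("25.86");
        BigDecimal c = new BigDecimal("0.087");

        try {
            a.divide(b);  // non-terminating expansion: throws ArithmeticException
        } catch (ArithmeticException e) {
            System.out.println("division needs an explicit scale: " + e.getMessage());
        }

        // With an explicit scale and rounding mode, the divide-then-multiply
        // chain from the bug report succeeds.
        BigDecimal result = a.divide(b, 10, RoundingMode.HALF_UP).multiply(c);
        System.out.println(result);
    }
}
```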
[jira] [Work started] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-5022 started by Teddy Choi. Decimal Arithmetic generates NULL value --- Key: HIVE-5022 URL: https://issues.apache.org/jira/browse/HIVE-5022 Project: Hive Issue Type: Bug Components: Types Affects Versions: 0.11.0 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107 Reporter: Kevin Soo Hoo Assignee: Teddy Choi Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation. Instead, a NULL is returned. The following yield NULL results: select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1; select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1; If we move the multiplication operation to be first, then it will successfully calculate the result. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-5022: - Status: Open (was: Patch Available) Decimal Arithmetic generates NULL value --- Key: HIVE-5022 URL: https://issues.apache.org/jira/browse/HIVE-5022 Project: Hive Issue Type: Bug Components: Types Affects Versions: 0.11.0 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107 Reporter: Kevin Soo Hoo Assignee: Teddy Choi Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation. Instead, a NULL is returned. The following yield NULL results: select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1; select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1; If we move the multiplication operation to be first, then it will successfully calculate the result. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739722#comment-13739722 ] Teddy Choi commented on HIVE-5022: -- It affected some arithmetic results. I'll update it. :) Decimal Arithmetic generates NULL value --- Key: HIVE-5022 URL: https://issues.apache.org/jira/browse/HIVE-5022 Project: Hive Issue Type: Bug Components: Types Affects Versions: 0.11.0 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107 Reporter: Kevin Soo Hoo Assignee: Teddy Choi Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation. Instead, a NULL is returned. The following yield NULL results: select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1; select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1; If we move the multiplication operation to be first, then it will successfully calculate the result. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4705) PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql
[ https://issues.apache.org/jira/browse/HIVE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4705. Resolution: Not A Problem Your problem is a valid one, ant command you are expecting to work, should work. Its unfortunate that our build system doesnt let it run as accepted. But, moving test classes to src package isn't acceptable solution to fix this problem. We need to enhance our build system to make above ant command work. PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql - Key: HIVE-4705 URL: https://issues.apache.org/jira/browse/HIVE-4705 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-4705.D11205.1.patch Currently included in ql-test but is referenced from tests in other modules. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HIVE-4705) PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql
[ https://issues.apache.org/jira/browse/HIVE-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739757#comment-13739757 ] Ashutosh Chauhan edited comment on HIVE-4705 at 8/14/13 3:13 PM: - Your problem is a valid one, ant command you are expecting to work, should work. Its unfortunate that our build system doesnt let it run as expected. But, moving test classes to src package isn't acceptable solution to fix this problem. We need to enhance our build system to make above ant command work. was (Author: ashutoshc): Your problem is a valid one, ant command you are expecting to work, should work. Its unfortunate that our build system doesnt let it run as accepted. But, moving test classes to src package isn't acceptable solution to fix this problem. We need to enhance our build system to make above ant command work. PreExecutePrinter, EnforceReadOnlyTables, PostExecutePrinter should be included in ql - Key: HIVE-4705 URL: https://issues.apache.org/jira/browse/HIVE-4705 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-4705.D11205.1.patch Currently included in ql-test but is referenced from tests in other modules. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or
[ https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5047: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Sergey! Hive client filters partitions incorrectly via pushdown in certain cases involving or --- Key: HIVE-5047 URL: https://issues.apache.org/jira/browse/HIVE-5047 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-5047.D12141.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5068) Some queries fail due to XMLEncoder error on JDK7
[ https://issues.apache.org/jira/browse/HIVE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739765#comment-13739765 ] Brock Noland commented on HIVE-5068: I wrote a quick and simple change to use plain old java serialization and hit the error below. My guess is we'll have to mark some more stuff transient to do this. {noformat} Caused by: java.io.NotSerializableException: org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1181) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347) at java.util.ArrayList.writeObject(ArrayList.java:710) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347) at java.util.ArrayList.writeObject(ArrayList.java:710) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347) at java.util.HashMap.writeObject(HashMap.java:1099) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} Some queries fail due to XMLEncoder error on JDK7 - Key: HIVE-5068 URL: https://issues.apache.org/jira/browse/HIVE-5068 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Looks like something snuck in that breaks the JDK 7 build: {noformat} Caused by: java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(ASTNode); ... 
106 more Caused by: java.lang.RuntimeException: Cannot serialize object at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:598) at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:238) at java.beans.DefaultPersistenceDelegate.initialize(DefaultPersistenceDelegate.java:400) at java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:118) at java.beans.Encoder.writeObject(Encoder.java:74) at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327) at java.beans.Encoder.writeExpression(Encoder.java:330) at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454) at java.beans.PersistenceDelegate.writeObject(PersistenceDelegate.java:115)
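[Editorial note] Marking fields transient, as Brock suggests, is the standard Java serialization escape hatch for members like GenericUDFOPEqual in the trace above. A toy sketch — these classes stand in for the plan objects and are not the actual Hive classes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Stands in for a non-serializable helper held by a plan node.
    static class Helper {}

    static class PlanNode implements Serializable {
        String name = "equals";
        transient Helper helper = new Helper();  // skipped by serialization
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            // Succeeds only because 'helper' is transient; a non-transient
            // Helper field would throw NotSerializableException here.
            oos.writeObject(new PlanNode());
        }

        ObjectInputStream ois =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        PlanNode copy = (PlanNode) ois.readObject();
        // Transient fields come back null; the owner must rebuild them lazily.
        System.out.println(copy.helper == null);  // prints true
    }
}
```

The catch, as the comment anticipates, is the second half of the pattern: every transient field has to be reconstructed after deserialization, which is the "mark some more stuff transient" work.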
[jira] [Assigned] (HIVE-5069) Tests on list bucketing are failing again in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-5069: -- Assignee: Sergey Shelukhin (was: Navis) [~sershe] Assigning this to you. We want to make sure the fix you are suggesting of adding order by in sql query doesn't have any -ve perf impact. Or, is there a better fix without involving sql change than whats currently in the patch? Tests on list bucketing are failing again in hadoop2 Key: HIVE-5069 URL: https://issues.apache.org/jira/browse/HIVE-5069 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Navis Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5069.D12201.1.patch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5089) Non query PreparedStatements are always failing on remote HiveServer2
Julien Letrouit created HIVE-5089: - Summary: Non query PreparedStatements are always failing on remote HiveServer2 Key: HIVE-5089 URL: https://issues.apache.org/jira/browse/HIVE-5089 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.11.0 Reporter: Julien Letrouit This is reproducing the issue systematically: import org.apache.hive.jdbc.HiveDriver; import java.sql.Connection; import java.sql.DriverManager; import java.sql.PreparedStatement; public class Main { public static void main(String[] args) throws Exception { DriverManager.registerDriver(new HiveDriver()); Connection conn = DriverManager.getConnection("jdbc:hive2://someserver"); PreparedStatement smt = conn.prepareStatement("SET hivevar:test=1"); smt.execute(); // Exception here conn.close(); } } It is producing the following stacktrace: Exception in thread "main" java.sql.SQLException: Could not create ResultSet: null at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183) at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:134) at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122) at org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194) at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137) at Main.main(Main.java:12) Caused by: org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetResultSetMetadata(TCLIService.java:466) at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:453) at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:154) ... 5 more I tried to fix it, unfortunately, the standalone server used in unit tests do not reproduce the issue. The following test added to TestJdbcDriver2 is passing: public void testNonQueryPrepareStatement() throws Exception { try { PreparedStatement ps = con.prepareStatement("SET hivevar:test=1"); boolean hasResultSet = ps.execute(); assertTrue(hasResultSet); ps.close(); } catch (Exception e) { e.printStackTrace(); fail(e.toString()); } } Any guidance on how to reproduce it in tests would be appreciated. Impact: the data analysis tools we are using are performing PreparedStatements. The use of custom UDF is forcing us to add 'ADD JAR ...' and 'CREATE TEMPORARY FUNCTION ...' statement to our query. Those statements are failing when executed as PreparedStatements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5089) Non query PreparedStatements are always failing on remote HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Letrouit updated HIVE-5089: -- Description: This is reproducing the issue systematically: {noformat} import org.apache.hive.jdbc.HiveDriver; import java.sql.Connection; import java.sql.DriverManager; import java.sql.PreparedStatement; public class Main { public static void main(String[] args) throws Exception { DriverManager.registerDriver(new HiveDriver()); Connection conn = DriverManager.getConnection("jdbc:hive2://someserver"); PreparedStatement smt = conn.prepareStatement("SET hivevar:test=1"); smt.execute(); // Exception here conn.close(); } } {noformat} It is producing the following stacktrace: {noformat} Exception in thread "main" java.sql.SQLException: Could not create ResultSet: null at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183) at org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:134) at org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122) at org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194) at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137) at Main.main(Main.java:12) Caused by: org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetResultSetMetadata(TCLIService.java:466) at org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:453) at org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:154) ... 5 more {noformat} I tried to fix it, unfortunately, the standalone server used in unit tests do not reproduce the issue. The following test added to TestJdbcDriver2 is passing: {noformat} public void testNonQueryPrepareStatement() throws Exception { try { PreparedStatement ps = con.prepareStatement("SET hivevar:test=1"); boolean hasResultSet = ps.execute(); assertTrue(hasResultSet); ps.close(); } catch (Exception e) { e.printStackTrace(); fail(e.toString()); } } {noformat} Any guidance on how to reproduce it in tests would be appreciated. Impact: the data analysis tools we are using are performing PreparedStatements. The use of custom UDF is forcing us to add 'ADD JAR ...' and 'CREATE TEMPORARY FUNCTION ...' statement to our query. Those statements are failing when executed as PreparedStatements. 
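Until the server-side behavior is fixed, a client can sidestep the failing ResultSet construction by not routing no-result commands through the prepared-query path. The helper below is a hypothetical sketch, not part of the Hive JDBC driver: `producesResultSet` is an invented name, and the command list covers only the statements mentioned above (SET, ADD JAR, CREATE TEMPORARY FUNCTION).

```java
// Hypothetical client-side helper (not Hive API): classify HiveQL commands
// that produce no result set, so a caller can run them via a plain
// Statement.execute() and skip ResultSet construction entirely.
public class HiveCommandKind {

    static boolean producesResultSet(String sql) {
        String s = sql.trim().toLowerCase();
        // SET, ADD JAR and CREATE TEMPORARY FUNCTION return no rows.
        return !(s.startsWith("set ")
              || s.startsWith("add jar ")
              || s.startsWith("create temporary function "));
    }

    public static void main(String[] args) {
        System.out.println(producesResultSet("SET hivevar:test=1"));        // false
        System.out.println(producesResultSet("SELECT * FROM src LIMIT 1")); // true
    }
}
```

A caller could branch on this to decide between `stmt.execute(sql)` with no ResultSet fetch and a normal prepared query.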
[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.
[ https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739774#comment-13739774 ] Ashutosh Chauhan commented on HIVE-5048: I think your checks are masking the underlying problem. IMO the correct fix is that Warehouse should always be initialized. If it so happens that the metastore is up and the warehouse isn't, that's an illegal state, no matter whether calls are made from the client or the server. This has been discussed before as well: https://issues.apache.org/jira/browse/HIVE-2079?focusedCommentId=13104063&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13104063 StorageBasedAuthorization provider causes an NPE when asked to authorize from client side. -- Key: HIVE-5048 URL: https://issues.apache.org/jira/browse/HIVE-5048 Project: Hive Issue Type: Bug Components: Security Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5048.patch StorageBasedAuthorizationProvider (henceforth referred to as SBAP) is a HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP; HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705. As long as it is used as an HMAP, i.e. from the metastore side, as was its initial implementation intent, everything is fine. However, HMAP extends HAP, and there is no reason SBAP shouldn't be expected to work as a HAP as well. Yet it uses a wh variable that is never initialized when it is called as a HAP, so authorize always fails in that mode. We should change SBAP so that it correctly initializes wh and can be run as a HAP as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
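The initialization problem can be sketched in miniature. The class below is purely illustrative (`LazyInitProvider` and `getWh` are invented names, and this is not Hive's SBAP code): it shows the create-on-demand guard that keeps authorize() from dereferencing a field that only the metastore-side construction path had populated.

```java
// Illustrative sketch only: a provider whose 'wh' field is populated on one
// construction path (metastore side) but not the other (client side).
public class LazyInitProvider {

    private Object wh; // stands in for the Warehouse handle

    // Client-side construction never sets wh, so a direct field access in
    // authorize() would throw NullPointerException. Creating it on demand
    // makes both construction paths safe.
    Object getWh() {
        if (wh == null) {
            wh = new Object();
        }
        return wh;
    }

    boolean authorize() {
        // Goes through the guard instead of touching the field directly.
        return getWh() != null;
    }

    public static void main(String[] args) {
        System.out.println(new LazyInitProvider().authorize()); // true
    }
}
```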
[jira] [Updated] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed
[ https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5082: -- Fix Version/s: 0.12.0 Status: Patch Available (was: Open) Beeline usage is printed twice when beeline --help is executed Key: HIVE-5082 URL: https://issues.apache.org/jira/browse/HIVE-5082 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-5082.patch {code}
bin/beeline --help
/home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression expected
Listening for transport dt_socket at address: 8000
Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>              the JDBC URL to connect to
   -n <username>                  the username to connect as
   -p <password>                  the password to connect as
   -d <driver class>              the driver class to use
   -e <query>                     query that should be executed
   -f <file>                      script file that should be executed
   --color=[true/false]           control whether color is used for display
   --showHeader=[true/false]      show column names in query results
   --headerInterval=ROWS;         the interval between which heades are displayed
   --fastConnect=[true/false]     skip building table/column list for tab-completion
   --autoCommit=[true/false]      enable/disable automatic transaction commit
   --verbose=[true/false]         show verbose error messages and debug info
   --showWarnings=[true/false]    display connection warnings
   --showNestedErrs=[true/false]  display nested errors
   --numberFormat=[pattern]       format numbers using DecimalFormat pattern
   --force=[true/false]           continue running script even after errors
   --maxWidth=MAXWIDTH            the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH   the maximum width to use when displaying columns
   --silent=[true/false]          be more silent
   --autosave=[true/false]        automatically save preferences
   --outputformat=[table/vertical/csv/tsv] format mode for result display
   --isolation=LEVEL              set the transaction isolation level
   --help                         display this message
Usage: java org.apache.hive.cli.beeline.BeeLine
   -u <database url>              the JDBC URL to connect to
   -n <username>                  the username to connect as
   -p <password>                  the password to connect as
   -d <driver class>              the driver class to use
   -e <query>                     query that should be executed
   -f <file>                      script file that should be executed
   --color=[true/false]           control whether color is used for display
   --showHeader=[true/false]      show column names in query results
   --headerInterval=ROWS;         the interval between which heades are displayed
   --fastConnect=[true/false]     skip building table/column list for tab-completion
   --autoCommit=[true/false]      enable/disable automatic transaction commit
   --verbose=[true/false]         show verbose error messages and debug info
   --showWarnings=[true/false]    display connection warnings
   --showNestedErrs=[true/false]  display nested errors
   --numberFormat=[pattern]       format numbers using DecimalFormat pattern
   --force=[true/false]           continue running script even after errors
   --maxWidth=MAXWIDTH            the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH   the maximum width to use when displaying columns
   --silent=[true/false]          be more silent
   --autosave=[true/false]        automatically save preferences
   --outputformat=[table/vertical/csv/tsv] format mode for result display
   --isolation=LEVEL              set the transaction isolation level
   --help                         display this message
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed
[ https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5082: -- Attachment: HIVE-5082.patch
[jira] [Commented] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed
[ https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739784#comment-13739784 ] Brock Noland commented on HIVE-5082: +1 pending tests
[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or
[ https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739789#comment-13739789 ] Hudson commented on HIVE-5047: -- FAILURE: Integrated in Hive-trunk-hadoop2 #359 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/359/]) HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513926) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java * /hive/trunk/ql/src/test/queries/clientpositive/push_or.q * /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out Hive client filters partitions incorrectly via pushdown in certain cases involving or --- Key: HIVE-5047 URL: https://issues.apache.org/jira/browse/HIVE-5047 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-5047.D12141.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW
[ https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739783#comment-13739783 ] Hive QA commented on HIVE-2608: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597899/HIVE-2608.D4317.8.patch {color:green}SUCCESS:{color} +1 2856 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/435/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/435/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Do not require AS a,b,c part in LATERAL VIEW Key: HIVE-2608 URL: https://issues.apache.org/jira/browse/HIVE-2608 Project: Hive Issue Type: Improvement Components: Query Processor, UDF Reporter: Igor Kabiljo Assignee: Navis Priority: Minor Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch Currently it is required to state column names when LATERAL VIEW is used. That shouldn't be necessary, since a UDTF returns a struct that contains the column names, and those should be used by default. For example, it would be great if this were possible: SELECT t.*, t.key1 + t.key4 FROM some_table LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key4') t; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739810#comment-13739810 ] Leo Romanoff commented on HIVE-1511: @Ashutosh: I tried out your latest patch. My results and conclusions: 1) auto_sortmerge_join-*.q failure: judging by the stack trace of the exception I get, the copyMRWork method still uses the XML serializer instead of Kryo. 2) bucketcontext_*.q fails because it seems to produce wrong numeric results, and the test compares the expected number to the one delivered by the test run. So it appears to be a semantic error, not the usual exception during (de)serialization. 3) I tried some of the smb_mapjoin_*.q tests at random; all of them seem to finish successfully. Regarding reporting problems: it would be nice if reports provided exceptions with stack traces and maybe other information that could be useful for identifying the real problems. It helps a lot. -Leo Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Assignee: Mohammad Kamrul Islam Attachments: HIVE-1511.patch, HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM, but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
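The size gap being attacked here can be seen in miniature with JDK-only serializers: java.beans XML output, of the kind Hive's plan serialization has relied on, is far bulkier than a plain binary encoding of the same repetitive structure. This is a rough stand-in, not Hive or Kryo code; the fake "plan" is just nested lists mimicking the repeated key=0 predicates from the test case above.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class PlanSerSize {

    // Build a deeply repetitive structure, loosely mimicking a query plan
    // with a hundred near-identical predicate nodes (key=0 OR key=0 OR ...).
    static ArrayList<ArrayList<String>> fakePlan() {
        ArrayList<ArrayList<String>> plan = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            ArrayList<String> node = new ArrayList<>();
            node.add("key"); node.add("="); node.add("0");
            plan.add(node);
        }
        return plan;
    }

    // Java-beans XML serialization: verbose per-element markup.
    static int xmlSize(Object o) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(out)) {
            enc.writeObject(o);
        }
        return out.size();
    }

    // Plain JDK binary serialization as a rough stand-in for a compact codec.
    static int binarySize(Object o) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(o);
        }
        return out.size();
    }

    public static void main(String[] args) throws Exception {
        Object plan = fakePlan();
        System.out.println("xml=" + xmlSize(plan) + " binary=" + binarySize(plan));
    }
}
```

The binary form comes out many times smaller because repeated strings become back-references instead of repeated markup; a purpose-built codec such as Kryo shrinks and speeds this up further.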
[jira] [Updated] (HIVE-5059) Meaningless warning message from TypeCheckProcFactory
[ https://issues.apache.org/jira/browse/HIVE-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5059: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Meaningless warning message from TypeCheckProcFactory - Key: HIVE-5059 URL: https://issues.apache.org/jira/browse/HIVE-5059 Project: Hive Issue Type: Task Components: Logging Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-5059.D12159.1.patch A regression from HIVE-3849: hive logs meaningless messages at warning level, like the one below. {noformat} WARN parse.TypeCheckProcFactory (TypeCheckProcFactory.java:convert(180)) - Invalid type entry TOK_TABLE_OR_COL=null {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5062) Insert + orderby + limit does not need additional RS for limiting rows
[ https://issues.apache.org/jira/browse/HIVE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5062: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Insert + orderby + limit does not need additional RS for limiting rows -- Key: HIVE-5062 URL: https://issues.apache.org/jira/browse/HIVE-5062 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-5062.D12171.1.patch The query, {noformat} insert overwrite table dummy select * from src order by key limit 10; {noformat} runs two MR jobs, but a single MR job is enough. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3189) cast ( string type as bigint) returning null values
[ https://issues.apache.org/jira/browse/HIVE-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3189: --- Assignee: Xiu cast ( string type as bigint) returning null values - Key: HIVE-3189 URL: https://issues.apache.org/jira/browse/HIVE-3189 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: N Campbell Assignee: Xiu Attachments: Hive-3189.patch.txt
select rnum, c1, cast(c1 as bigint) from cert.tsdchar tsdchar where rnum in (0,1,2)
create table if not exists CERT.TSDCHAR ( RNUM int , C1 string) row format sequencefile
rnum  c1   _c2
0     -1   null
1     0    null
2     10   null
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
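For reference, the semantics the reporter expects can be sketched as a tiny stand-in (this is not Hive's actual cast implementation; `castToBigint` is an invented name): a string that parses as a long yields its value, and anything unparseable yields SQL NULL. Under that rule, the rows above should produce -1, 0 and 10 rather than null.

```java
public class StringToBigint {

    // Hypothetical stand-in for string-to-bigint cast semantics:
    // parseable strings yield a Long, everything else yields SQL NULL.
    static Long castToBigint(String s) {
        if (s == null) return null;
        try {
            return Long.parseLong(s.trim());
        } catch (NumberFormatException e) {
            return null; // unparseable -> NULL, not an error
        }
    }

    public static void main(String[] args) {
        System.out.println(castToBigint("-1"));  // -1
        System.out.println(castToBigint("10"));  // 10
        System.out.println(castToBigint("abc")); // null
    }
}
```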
[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or
[ https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739839#comment-13739839 ] Hudson commented on HIVE-5047: -- FAILURE: Integrated in Hive-trunk-h0.21 #2268 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2268/]) HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1513926) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java * /hive/trunk/ql/src/test/queries/clientpositive/push_or.q * /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out Hive client filters partitions incorrectly via pushdown in certain cases involving or --- Key: HIVE-5047 URL: https://issues.apache.org/jira/browse/HIVE-5047 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-5047.D12141.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739841#comment-13739841 ] Ashutosh Chauhan commented on HIVE-1511: [~romixlev] Thanks for taking a look, and I appreciate your continued help. The reason I have not reported these failures on the Kryo list is exactly the one you identified: I am not yet sure these failures are caused by bugs in Kryo. We need to do more digging at our end to validate that our usage of Kryo is correct and that the patch is correct as well.
[jira] [Updated] (HIVE-3189) cast ( string type as bigint) returning null values
[ https://issues.apache.org/jira/browse/HIVE-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3189: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xiu, for the testcase!
[jira] [Updated] (HIVE-5061) Row sampling throws NPE when used in sub-query
[ https://issues.apache.org/jira/browse/HIVE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5061: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Row sampling throws NPE when used in sub-query -- Key: HIVE-5061 URL: https://issues.apache.org/jira/browse/HIVE-5061 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.12.0 Attachments: HIVE-5061.D12165.1.patch select * from (select * from src TABLESAMPLE (1 ROWS)) x; {noformat}
java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.parse.SplitSample.getTargetSize(SplitSample.java:103)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.sampleSplits(CombineHiveInputFormat.java:487)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:405)
	at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1025)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1017)
	at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:928)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:881)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:881)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:855)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:144)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1424)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1204)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:878)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
{noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5032) Enable hive creating external table at the root directory of DFS
[ https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739903#comment-13739903 ] Mostafa Elhemali commented on HIVE-5032: +1 from me as well - thanks [~shuainie]. Enable hive creating external table at the root directory of DFS Key: HIVE-5032 URL: https://issues.apache.org/jira/browse/HIVE-5032 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Attachments: HIVE-5032.1.patch Creating an external table in Hive with its location pointing to the root directory of DFS fails, because HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the path the same as a folder and cannot find a match in the pathToPartitionInfo table when doing a prefix match. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
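The prefix-matching pitfall described above can be sketched in isolation. The snippet is a simplified stand-in, not Hive's HiveFileFormatUtils code: a naive string-prefix test breaks exactly when the table directory is the filesystem root, while comparing the URI's authority and path separately does not.

```java
import java.net.URI;

public class RootPathMatch {

    // Naive string-prefix test, a simplified stand-in (not Hive's code) for
    // matching a file against pathToPartitionInfo keys by prefix.
    static boolean naivePrefix(String dir, String file) {
        return file.startsWith(dir + "/");
    }

    // URI-aware test: compare authority and path separately, so a table
    // located at the filesystem root still matches its files.
    static boolean uriPrefix(String dir, String file) {
        URI d = URI.create(dir);
        URI f = URI.create(file);
        return f.getAuthority().equals(d.getAuthority())
            && f.getPath().startsWith(d.getPath());
    }

    public static void main(String[] args) {
        String root = "hdfs://namenode:8020/"; // table located at DFS root
        String file = "hdfs://namenode:8020/data/part-00000";
        // dir + "/" becomes "hdfs://namenode:8020//", so the naive check
        // fails precisely when the table sits at the root.
        System.out.println(naivePrefix(root, file)); // false
        System.out.println(uriPrefix(root, file));   // true
    }
}
```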
Adding to the hive contributor list
Hi, I would like to get added to the contributor list. Thanks Hari
[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739924#comment-13739924 ] Sergey Shelukhin commented on HIVE-5069: SQL query change shouldn't affect performance Tests on list bucketing are failing again in hadoop2 Key: HIVE-5069 URL: https://issues.apache.org/jira/browse/HIVE-5069 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Navis Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5069.D12201.1.patch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
[jira] [Commented] (HIVE-5083) Group by ignored when group by column is a partition column
[ https://issues.apache.org/jira/browse/HIVE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739951#comment-13739951 ] Micah Gutman commented on HIVE-5083: Finally found the bug by using show extended table partition spec to figure out that all partitions were pointing to a single file. My selects only looked like they were working; they were just reading the same data over and over. Specifically, I created my partitions with alter table using multiple partition specs in the same command. Interestingly, the wiki page help said: Note that it is proper syntax to have multiple partition_spec in a single ALTER TABLE, but if you do this in version 0.7, your partitioning scheme will fail. That is, every query specifying a partition will always use only the first partition. I am using 0.11, not 0.7. Apparently, 0.11 (and perhaps everything after 0.7?) has this problem. Group by ignored when group by column is a partition column --- Key: HIVE-5083 URL: https://issues.apache.org/jira/browse/HIVE-5083 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Environment: linux Reporter: Micah Gutman I have an external table X with partition date (a string MMDD): select X.date, count(*) from X group by X.date Rather than getting a count breakdown by date, I get a single row returned with the count for the entire table. The date column returned in my single row appears to be the last partition in the table. Note that results appear as expected if I select an arbitrary real column from my table: select X.foo, count(*) from X group by X.foo correctly gives me a single row per value of X.foo. Also, my query works fine when I use the date column in the where clause, so the partition does seem to be working. select X.date, count(*) from X where X.date = 20130101 correctly gives me a single row with the count for the date 20130101.
[jira] [Resolved] (HIVE-5083) Group by ignored when group by column is a partition column
[ https://issues.apache.org/jira/browse/HIVE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Gutman resolved HIVE-5083. Resolution: Not A Problem The reported problem is just a symptom of a different known bug. Group by ignored when group by column is a partition column --- Key: HIVE-5083 URL: https://issues.apache.org/jira/browse/HIVE-5083 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Environment: linux Reporter: Micah Gutman I have an external table X with partition date (a string MMDD): select X.date, count(*) from X group by X.date Rather than getting a count breakdown by date, I get a single row returned with the count for the entire table. The date column returned in my single row appears to be the last partition in the table. Note that results appear as expected if I select an arbitrary real column from my table: select X.foo, count(*) from X group by X.foo correctly gives me a single row per value of X.foo. Also, my query works fine when I use the date column in the where clause, so the partition does seem to be working. select X.date, count(*) from X where X.date = 20130101 correctly gives me a single row with the count for the date 20130101.
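The underlying symptom the reporter found ("all partitions were pointing to a single file") can be simulated in a few lines. This is a hypothetical sketch, not Hive code; the partition map and rows are invented. It shows why the selects only "looked like they were working": each partition value re-reads the same data, so every per-partition scan returns identical rows.

```python
from collections import Counter

# Hypothetical partition metadata: the known ALTER TABLE bug left every
# partition's location pointing at the same directory.
broken_partitions = {"20130101": "/data/p1", "20130102": "/data/p1"}
files = {"/data/p1": ["row1", "row2"]}

def scan(partitions, files):
    # Tag each row with the partition value it was read under,
    # mimicking a per-partition table scan.
    return [(date, row) for date, loc in sorted(partitions.items()) for row in files[loc]]

rows = scan(broken_partitions, files)
counts = Counter(date for date, _ in rows)
# Both "dates" yield exactly the same rows, so the per-date counts are
# identical duplicates of one underlying file's contents.
print(counts)
```

With healthy metadata each partition would map to its own location and the counts would reflect genuinely distinct data.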
Re: Adding to the hive contributor list
Done. Welcome Hari to the project. Thanks, Ashutosh On Wed, Aug 14, 2013 at 10:32 AM, Hari Subramaniyan hsubramani...@hortonworks.com wrote: Hi, I would like to get added to contributor list. Thanks Hari
[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739957#comment-13739957 ] Ashutosh Chauhan commented on HIVE-5069: So, do you think the patch is good enough in its current state, or do you want to make the changes you suggested? Tests on list bucketing are failing again in hadoop2 Key: HIVE-5069 URL: https://issues.apache.org/jira/browse/HIVE-5069 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Navis Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5069.D12201.1.patch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
[jira] [Commented] (HIVE-5047) Hive client filters partitions incorrectly via pushdown in certain cases involving or
[ https://issues.apache.org/jira/browse/HIVE-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739963#comment-13739963 ] Hudson commented on HIVE-5047: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #57 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/57/]) HIVE-5047 : Hive client filters partitions incorrectly via pushdown in certain cases involving or (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1513926) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java * /hive/trunk/ql/src/test/queries/clientpositive/push_or.q * /hive/trunk/ql/src/test/results/clientpositive/push_or.q.out Hive client filters partitions incorrectly via pushdown in certain cases involving or --- Key: HIVE-5047 URL: https://issues.apache.org/jira/browse/HIVE-5047 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-5047.D12141.1.patch
[jira] [Commented] (HIVE-4985) refactor/clean up partition name pruning to be usable inside metastore server
[ https://issues.apache.org/jira/browse/HIVE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739962#comment-13739962 ] Hudson commented on HIVE-4985: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #57 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/57/]) HIVE-4985 : refactor/clean up partition name pruning to be usable inside metastore server (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1513596) * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/listbucketingpruner/ListBucketingPruner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/PrunedPartitionList.java refactor/clean up partition name pruning to be usable inside metastore server -- Key: HIVE-4985 URL: https://issues.apache.org/jira/browse/HIVE-4985 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: HIVE-4985.D11961.1.patch, HIVE-4985.D11961.2.patch, HIVE-4985.D11961.3.patch, HIVE-4985.D11961.4.patch, HIVE-4985.D11961.5.patch Preliminary for HIVE-4914. 
The patch is going to be large already, so some refactoring and dead code removal that is non-controversial can be done in advance in a separate patch.
[jira] [Created] (HIVE-5090) Remove unwanted file from the trunk.
Ashutosh Chauhan created HIVE-5090: -- Summary: Remove unwanted file from the trunk. Key: HIVE-5090 URL: https://issues.apache.org/jira/browse/HIVE-5090 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Seems like ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.orig got accidentally checked in.
[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory
[ https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5055: Assignee: Hari Sankar Sivarama Subramaniyan SessionState temp file gets created in history file directory - Key: HIVE-5055 URL: https://issues.apache.org/jira/browse/HIVE-5055 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan SessionState.start creates a temp file for temp results, but this file is created in hive.querylog.location, which is supposed to be used only for Hive history log files.
[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739973#comment-13739973 ] Sergey Shelukhin commented on HIVE-5069: Let me provide a potential alternative patch shortly; I am checking that it builds and passes some tests. Tests on list bucketing are failing again in hadoop2 Key: HIVE-5069 URL: https://issues.apache.org/jira/browse/HIVE-5069 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Navis Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5069.D12201.1.patch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
[jira] [Updated] (HIVE-4246) Implement predicate pushdown for ORC
[ https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4246: -- Attachment: HIVE-4246.D11415.5.patch omalley updated the revision HIVE-4246 [jira] Implement predicate pushdown for ORC. updated expected test results Reviewers: hagleitn, JIRA REVISION DETAIL https://reviews.facebook.net/D11415 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D11415?vs=37767id=37875#toc BRANCH h-4246 ARCANIST PROJECT hive AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/BitFieldReader.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/InStream.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/RunLengthByteReader.java ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitFieldReader.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitPack.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInStream.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestIntegerCompressionReader.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthByteReader.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthIntegerReader.java ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 
ql/src/test/results/compiler/plan/case_sensitivity.q.xml ql/src/test/results/compiler/plan/cast1.q.xml ql/src/test/results/compiler/plan/groupby1.q.xml ql/src/test/results/compiler/plan/groupby2.q.xml ql/src/test/results/compiler/plan/groupby3.q.xml ql/src/test/results/compiler/plan/groupby4.q.xml ql/src/test/results/compiler/plan/groupby5.q.xml ql/src/test/results/compiler/plan/groupby6.q.xml ql/src/test/results/compiler/plan/input1.q.xml ql/src/test/results/compiler/plan/input2.q.xml ql/src/test/results/compiler/plan/input20.q.xml ql/src/test/results/compiler/plan/input3.q.xml ql/src/test/results/compiler/plan/input4.q.xml ql/src/test/results/compiler/plan/input5.q.xml ql/src/test/results/compiler/plan/input6.q.xml ql/src/test/results/compiler/plan/input7.q.xml ql/src/test/results/compiler/plan/input8.q.xml ql/src/test/results/compiler/plan/input9.q.xml ql/src/test/results/compiler/plan/input_part1.q.xml ql/src/test/results/compiler/plan/input_testsequencefile.q.xml ql/src/test/results/compiler/plan/input_testxpath.q.xml ql/src/test/results/compiler/plan/input_testxpath2.q.xml ql/src/test/results/compiler/plan/join1.q.xml ql/src/test/results/compiler/plan/join2.q.xml ql/src/test/results/compiler/plan/join3.q.xml ql/src/test/results/compiler/plan/join4.q.xml ql/src/test/results/compiler/plan/join5.q.xml ql/src/test/results/compiler/plan/join6.q.xml ql/src/test/results/compiler/plan/join7.q.xml ql/src/test/results/compiler/plan/join8.q.xml ql/src/test/results/compiler/plan/sample1.q.xml ql/src/test/results/compiler/plan/sample2.q.xml ql/src/test/results/compiler/plan/sample3.q.xml ql/src/test/results/compiler/plan/sample4.q.xml ql/src/test/results/compiler/plan/sample5.q.xml ql/src/test/results/compiler/plan/sample6.q.xml ql/src/test/results/compiler/plan/sample7.q.xml ql/src/test/results/compiler/plan/subq.q.xml ql/src/test/results/compiler/plan/udf1.q.xml ql/src/test/results/compiler/plan/udf4.q.xml ql/src/test/results/compiler/plan/udf6.q.xml 
ql/src/test/results/compiler/plan/udf_case.q.xml ql/src/test/results/compiler/plan/udf_when.q.xml ql/src/test/results/compiler/plan/union.q.xml serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java To: JIRA, hagleitn, omalley Cc: hagleitn Implement predicate pushdown for ORC Key: HIVE-4246 URL: https://issues.apache.org/jira/browse/HIVE-4246 Project: Hive Issue Type: New Feature Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4246.D11415.1.patch, HIVE-4246.D11415.2.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.3.patch, HIVE-4246.D11415.4.patch, HIVE-4246.D11415.5.patch By using the push down
[jira] [Work started] (HIVE-5055) SessionState temp file gets created in history file directory
[ https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-5055 started by Hari Sankar Sivarama Subramaniyan. SessionState temp file gets created in history file directory - Key: HIVE-5055 URL: https://issues.apache.org/jira/browse/HIVE-5055 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan SessionState.start creates a temp file for temp results, but this file is created in hive.querylog.location, which is supposed to be used only for Hive history log files.
[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory
[ https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5055: Attachment: HIVE-5055.1.patch.txt SessionState temp file gets created in history file directory - Key: HIVE-5055 URL: https://issues.apache.org/jira/browse/HIVE-5055 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5055.1.patch.txt SessionState.start creates a temp file for temp results, but this file is created in hive.querylog.location, which is supposed to be used only for Hive history log files.
[jira] [Updated] (HIVE-5055) SessionState temp file gets created in history file directory
[ https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5055: Status: Patch Available (was: In Progress) Changed the SessionState temp file location to the local scratch directory. SessionState temp file gets created in history file directory - Key: HIVE-5055 URL: https://issues.apache.org/jira/browse/HIVE-5055 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5055.1.patch.txt SessionState.start creates a temp file for temp results, but this file is created in hive.querylog.location, which is supposed to be used only for Hive history log files.
[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.
[ https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740017#comment-13740017 ] Sushanth Sowmyan commented on HIVE-5048: Ashutosh, I'm afraid I don't understand what problem I'm masking; the fix I add does exactly that - it makes sure a wh variable is always instantiated. In the current form, SBAP is usable only from the metastore-side, and wasn't initially written to be initializable from the client-side. When initialized from the metastore, setMetaStoreHandler is called, the wh variable is initialized from handler.getWh(), and all is good. This patch addresses the case where it is called from the client-side, in which case we do not have a wh object (which we were previously getting from the metastore) and so, with this patch, we initialize it. I added the else{} block for the sake of completeness/documentation, but realistically, that can never/will never be entered. I can change the MetaException to an IllegalStateException to make that more strict, if that's what you mean. StorageBasedAuthorization provider causes an NPE when asked to authorize from client side. -- Key: HIVE-5048 URL: https://issues.apache.org/jira/browse/HIVE-5048 Project: Hive Issue Type: Bug Components: Security Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5048.patch StorageBasedAuthorizationProvider (henceforth referred to as SBAP) is a HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705. As long as it's used as a HMAP, i.e. from the metastore-side, as was its initial implementation intent, everything's great. However, HMAP extends HAP, and there is no reason SBAP shouldn't be expected to work as a HAP as well. However, it uses a wh variable that is never initialized if it is called as a HAP, and hence, it will always fail when authorize is called on it.
We should change SBAP so that it correctly initializes wh so that it can be run as a HAP as well.
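The defensive-initialization pattern described in the comment above can be sketched generically. This is a hypothetical illustration in Python (not Hive's actual Java classes; all names here are invented stand-ins): when the provider is initialized server-side, the warehouse handle comes from the handler; when invoked client-side, the handle is constructed on first use, with an unreachable-by-design guard for completeness.

```python
class Warehouse:
    """Stand-in for a warehouse handle built from a configuration."""
    def __init__(self, conf):
        self.conf = conf

class StorageBasedAuthSketch:
    """Illustrative sketch of the fix discussed above (hypothetical names):
    ensure the wh handle exists whether the provider was initialized
    metastore-side or client-side."""

    def __init__(self):
        self.wh = None  # set via set_metastore_handler() on the server side

    def set_metastore_handler(self, handler):
        # Metastore-side path: the handler supplies the warehouse handle.
        self.wh = handler.get_wh()

    def authorize(self, conf):
        if self.wh is None:
            # Client-side path: no metastore handler, so build the handle here.
            self.wh = Warehouse(conf)
        if self.wh is None:
            # Completeness/documentation guard: should never be entered
            # once the branch above has run.
            raise RuntimeError("warehouse handle could not be initialized")
        return self.wh

# Client-side use no longer dereferences an uninitialized handle:
w = StorageBasedAuthSketch().authorize({"hive.metastore.warehouse.dir": "/wh"})
print(type(w).__name__)  # → Warehouse
```

The second `if` mirrors the "else{} block for completeness" mentioned in the comment: it documents the invariant rather than handling a reachable state.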
[jira] [Updated] (HIVE-5069) Tests on list bucketing are failing again in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-5069: -- Attachment: HIVE-5069.D12243.1.patch sershe requested code review of HIVE-5069 [jira] Tests on list bucketing are failing again in hadoop2. Reviewers: JIRA Initial patch, I am trying to provide this quickly so I only ran one query... running others now. org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3 TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12243 AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? 
https://reviews.facebook.net/herald/transcript/29271/ To: JIRA, sershe Tests on list bucketing are failing again in hadoop2 Key: HIVE-5069 URL: https://issues.apache.org/jira/browse/HIVE-5069 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Navis Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-5069.D12201.1.patch, HIVE-5069.D12243.1.patch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
[jira] [Created] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
Owen O'Malley created HIVE-5091: --- Summary: ORC files should have an option to pad stripes to the HDFS block boundaries Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks.
[jira] [Updated] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-5091: -- Attachment: HIVE-5091.D12249.1.patch omalley requested code review of HIVE-5091 [jira] ORC files should have an option to pad stripes to the HDFS block boundaries. Reviewers: JIRA pad blocks out to the hdfs block boundaries With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D12249 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/29277/ To: JIRA, omalley ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-5091.D12249.1.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks.
[jira] [Commented] (HIVE-4423) Improve RCFile::sync(long) 10x
[ https://issues.apache.org/jira/browse/HIVE-4423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740051#comment-13740051 ] Gopal V commented on HIVE-4423: --- Good catch [~taguswang], it is in fact missing 1 byte at the end. Please log a new bug and assign it to me - I will fix this and add an extra test case for this off-by-one error. Improve RCFile::sync(long) 10x -- Key: HIVE-4423 URL: https://issues.apache.org/jira/browse/HIVE-4423 Project: Hive Issue Type: Improvement Environment: Ubuntu LXC (1 SSD, 1 disk, 32 gigs of RAM) Reporter: Gopal V Assignee: Gopal V Priority: Minor Labels: optimization Fix For: 0.12.0 Attachments: HIVE-4423.patch RCFile::sync(long) takes approximately 1 second every time it gets called because of the inner loops in the function. From what was observed with HDFS-4710, single-byte reads are an order of magnitude slower than larger 512-byte buffered reads. Even when disk I/O is buffered to this size, there is overhead due to the synchronized read() methods in the BlockReaderLocal and RemoteBlockReader classes. Replacing the readByte() calls in RCFile.sync(long) with a readFully(512-byte) call will speed this function up 10x.
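The buffered sync scan discussed above, including the end-of-buffer overlap that guards against exactly the off-by-one class of bug [~taguswang] spotted, can be sketched as follows. This is an illustrative Python sketch with an invented sync marker, not RCFile's actual implementation: the scanner reads 512-byte chunks instead of single bytes and keeps len(marker)-1 trailing bytes so a marker straddling a chunk boundary is still found.

```python
import io

SYNC = b"\x00SYNCMARK"  # hypothetical 9-byte sync marker

def find_sync(stream, marker=SYNC, bufsize=512):
    """Scan forward for `marker` using large buffered reads rather than
    byte-at-a-time reads. Keeps len(marker)-1 bytes of overlap between
    buffers so a marker that straddles a buffer boundary is not missed.
    Returns the marker's absolute offset, or -1 if not found."""
    overlap = len(marker) - 1
    base = 0       # absolute offset of the next unread byte
    tail = b""     # carried-over bytes from the previous buffer
    while True:
        chunk = stream.read(bufsize)
        if not chunk:
            return -1
        buf = tail + chunk
        idx = buf.find(marker)
        if idx != -1:
            # idx is relative to the carried tail; translate to absolute.
            return base - len(tail) + idx
        tail = buf[-overlap:]  # keep just enough for a straddling marker
        base += len(chunk)

# The marker starts 2 bytes before the first 512-byte read ends,
# so it spans the buffer boundary - the overlap still finds it.
data = b"a" * 510 + SYNC + b"rest"
print(find_sync(io.BytesIO(data)))  # → 510
```

Dropping the overlap (or making it one byte too short) reproduces the "missing 1 byte at the end" failure: a marker beginning in the last bytes of a buffer would never match.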
[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740052#comment-13740052 ] Owen O'Malley commented on HIVE-5091: - This patch: * Adds a new table property orc.block.padding, which defaults to true. * For stripes smaller than a block, if they would straddle the block boundary, zeros are written to get to the start of the next block. * The max block size is set to 1.5GB since 2GB - 1 created issues with blocksizes needing to be divisible by the checksum length (512). * Cleans up the interface to the OrcFile.createWriter so that the user can set parameters by name. * Cleans up the ability to write the 0.11 version of ORC files that was added in HIVE-4123. Ensures that the direct string encoding isn't used for 0.11 ORC files. * Updated most of the tests to use the new createWriter API. ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-5091.D12249.1.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks.
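The padding rule described in the comment ("for stripes smaller than a block, if they would straddle the block boundary, zeros are written to get to the start of the next block") reduces to a small piece of arithmetic. The sketch below is an illustration of that rule only, with an invented function name - it is not the ORC writer's actual code:

```python
def padding_before_stripe(offset, stripe_len, block_size):
    """Illustrative sketch of the stripe-padding rule: return how many
    zero bytes to write at `offset` so that a stripe of `stripe_len`
    bytes does not straddle an HDFS block boundary. Stripes larger than
    a block must straddle regardless, so they get no padding."""
    if stripe_len > block_size:
        return 0
    room = block_size - (offset % block_size)  # bytes left in the current block
    if stripe_len > room:
        return room  # pad with zeros up to the next block boundary
    return 0

# 256 MB blocks, writer sitting 10 MB before a block boundary, 64 MB stripe:
MB = 1 << 20
pad = padding_before_stripe(offset=246 * MB, stripe_len=64 * MB, block_size=256 * MB)
print(pad // MB)  # → 10 (pad 10 MB of zeros, then start the stripe at the boundary)
```

The trade-off is wasted space (up to one stripe's worth of zeros per block) in exchange for every small-enough stripe being readable from a single datanode.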
[jira] [Created] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows
Daniel Dai created HIVE-5092: Summary: Fix hiveserver2 mapreduce local job on Windows Key: HIVE-5092 URL: https://issues.apache.org/jira/browse/HIVE-5092 Project: Hive Issue Type: Bug Components: HiveServer2, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 HiveServer2 fails when a MapReduce local job fails. For example: {code} select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v on (s.name = v.name); {code} The root cause is a class-not-found error in the local Hadoop job (MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. Setting HADOOP_CLASSPATH correctly will fix the issue. However, there is one complication on Windows. We start HiveServer2 using the Windows service console (services.msc), which takes the hiveserver2.xml generated by hive.cmd. There is no way to pass an environment variable in hiveserver2.xml (weird, but a reality). I attach a patch which passes it through command-line arguments and relays it to HADOOP_CLASSPATH in the Hive code.
[jira] [Updated] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows
[ https://issues.apache.org/jira/browse/HIVE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5092: - Attachment: HIVE-5092-1.patch Fix hiveserver2 mapreduce local job on Windows -- Key: HIVE-5092 URL: https://issues.apache.org/jira/browse/HIVE-5092 Project: Hive Issue Type: Bug Components: HiveServer2, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5092-1.patch HiveServer2 fails when a MapReduce local job fails. For example: {code} select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v on (s.name = v.name); {code} The root cause is a class-not-found error in the local Hadoop job (MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. Setting HADOOP_CLASSPATH correctly will fix the issue. However, there is one complication on Windows. We start HiveServer2 using the Windows service console (services.msc), which takes the hiveserver2.xml generated by hive.cmd. There is no way to pass an environment variable in hiveserver2.xml (weird, but a reality). I attach a patch which passes it through command-line arguments and relays it to HADOOP_CLASSPATH in the Hive code.
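The relay described in HIVE-5092 can be sketched in a few lines. This is a minimal hypothetical helper, not the patch itself: the extra classpath entries arrive as a command-line argument (since the Windows service wrapper cannot set environment variables), and the code merges them into HADOOP_CLASSPATH in the environment handed to the child hadoop process, e.g. via ProcessBuilder.environment().

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class LocalTaskClasspath {
    // Build the child process environment: append the classpath entries
    // received as a CLI argument (hiveLibPath, e.g. %HIVE_HOME%\lib\*) to
    // any HADOOP_CLASSPATH already present in the parent environment.
    static Map<String, String> childEnv(Map<String, String> parentEnv, String hiveLibPath) {
        Map<String, String> env = new HashMap<>(parentEnv);
        String existing = env.get("HADOOP_CLASSPATH");
        env.put("HADOOP_CLASSPATH",
                existing == null || existing.isEmpty()
                        ? hiveLibPath
                        : existing + File.pathSeparator + hiveLibPath);
        return env;
    }
}
```

Appending rather than overwriting matters: clobbering an existing HADOOP_CLASSPATH would trade one class-not-found error for another.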
Re: Review Request 12827: HIVE-4611 - SMB joins fail based on bigtable selection policy.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12827/ --- (Updated Aug. 14, 2013, 7:21 p.m.) Review request for hive, Ashutosh Chauhan, Brock Noland, and Gunther Hagleitner. Changes --- Addressed Ashutosh's comments. Bugs: HIVE-4611 https://issues.apache.org/jira/browse/HIVE-4611 Repository: hive-git Description --- SMB joins fail based on bigtable selection policy. The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 12e9334 ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java fda2f84 ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java 1bed28f ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java db5ff0f ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java db3c9e7 ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java cd1b4ad ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java d33ea91 ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java 3071713 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java e214807 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java da5115b ql/src/test/queries/clientnegative/auto_sortmerge_join_1.q c858254 ql/src/test/queries/clientpositive/auto_sortmerge_join_15.q PRE-CREATION ql/src/test/results/clientnegative/auto_sortmerge_join_1.q.out 0eddb69 ql/src/test/results/clientnegative/smb_bucketmapjoin.q.out 7a5b8c1 
ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out PRE-CREATION Diff: https://reviews.apache.org/r/12827/diff/ Testing --- All tests pass on hadoop 1. Thanks, Vikram Dixit Kumaraswamy
[jira] [Updated] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4611: - Attachment: HIVE-4611.6.patch.txt Addressed Ashutosh's comments. Initially, I had an iterator over the list, and the javadoc of that method also said list. However, I see that I don't really need an ArrayList there, so I have changed the code accordingly. SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.5.patch.txt, HIVE-4611.6.patch.txt, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with the largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables.
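The "tie breaker, not ultimate say" principle from HIVE-4611 can be illustrated with a small sketch. This is a hypothetical simplification, not Hive's actual selector classes: join semantics (e.g. which side of an outer join must stream) first fix the set of legal big-table positions, and only then does the size-based policy pick among them.

```java
import java.util.Set;

public class BigTableTieBreaker {
    // Pick the big table only from the positions that join semantics
    // allow; table size decides among the legal candidates but can never
    // override them. Returns -1 when no position is legal, signalling
    // that the SMB conversion should be skipped rather than fail.
    static int chooseBigTable(Set<Integer> legalCandidates, long[] tableSizes) {
        int best = -1;
        for (int pos : legalCandidates) {
            if (best == -1 || tableSizes[pos] > tableSizes[best]) {
                best = pos;
            }
        }
        return best;
    }
}
```

In the buggy behavior the description complains about, the size policy effectively chose from all positions, so the largest table could win even when the outer-join constraint forbade it, and the query failed instead of falling back.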
[jira] [Commented] (HIVE-5082) Beeline usage is printed twice when beeline --help is executed
[ https://issues.apache.org/jira/browse/HIVE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740088#comment-13740088 ] Hive QA commented on HIVE-5082: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12597987/HIVE-5082.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 2858 tests executed *Failed tests:* {noformat} org.apache.hcatalog.hbase.snapshot.lock.TestWriteLock.testRun org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/437/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/437/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
Beeline usage is printed twice when beeline --help is executed Key: HIVE-5082 URL: https://issues.apache.org/jira/browse/HIVE-5082 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-5082.patch {code} bin/beeline --help /home/xzhang/apa/hive/build/dist/bin/hive: line 189: [: : integer expression expected Listening for transport dt_socket at address: 8000 Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as -d driver class the driver class to use -e query query that should be executed -f file script file that should be executed --color=[true/false]control whether color is used for display --showHeader=[true/false] show column names in query results --headerInterval=ROWS; the interval between which heades are displayed --fastConnect=[true/false] skip building table/column list for tab-completion --autoCommit=[true/false] enable/disable automatic transaction commit --verbose=[true/false] show verbose error messages and debug info --showWarnings=[true/false] display connection warnings --showNestedErrs=[true/false] display nested errors --numberFormat=[pattern]format numbers using DecimalFormat pattern --force=[true/false]continue running script even after errors --maxWidth=MAXWIDTH the maximum width of the terminal --maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying columns --silent=[true/false] be more silent --autosave=[true/false] automatically save preferences --outputformat=[table/vertical/csv/tsv] format mode for result display --isolation=LEVEL set the transaction isolation level --help display this message Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as -d driver class the driver class to use -e 
query query that should be executed -f file script file that should be executed --color=[true/false]control whether color is used for display --showHeader=[true/false] show column names in query results --headerInterval=ROWS; the interval between which heades are displayed --fastConnect=[true/false] skip building table/column list for tab-completion --autoCommit=[true/false] enable/disable automatic transaction commit --verbose=[true/false] show verbose error messages and debug info --showWarnings=[true/false] display connection warnings --showNestedErrs=[true/false] display nested errors --numberFormat=[pattern]format numbers using DecimalFormat pattern --force=[true/false]continue running script even after errors --maxWidth=MAXWIDTH the maximum width of the terminal --maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying columns --silent=[true/false] be more silent --autosave=[true/false] automatically save preferences
[jira] [Commented] (HIVE-2608) Do not require AS a,b,c part in LATERAL VIEW
[ https://issues.apache.org/jira/browse/HIVE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740095#comment-13740095 ] Edward Capriolo commented on HIVE-2608: --- +1 Do not require AS a,b,c part in LATERAL VIEW Key: HIVE-2608 URL: https://issues.apache.org/jira/browse/HIVE-2608 Project: Hive Issue Type: Improvement Components: Query Processor, UDF Reporter: Igor Kabiljo Assignee: Navis Priority: Minor Attachments: HIVE-2608.8.patch.txt, HIVE-2608.D4317.5.patch, HIVE-2608.D4317.6.patch, HIVE-2608.D4317.7.patch, HIVE-2608.D4317.8.patch Currently, it is required to state column names when LATERAL VIEW is used. That shouldn't be necessary, since the UDTF returns a struct which contains the column names - and they should be used by default. For example, it would be great if this was possible: SELECT t.*, t.key1 + t.key4 FROM some_table LATERAL VIEW JSON_TUPLE(json, 'key1', 'key2', 'key3', 'key3') t;
[jira] [Updated] (HIVE-4844) Add char/varchar data types
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4844: - Description: Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. NO PRECOMMIT TESTS was:Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Add char/varchar data types --- Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.1.patch.hack Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-4601) WebHCat, Templeton need to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4601: - Status: Open (was: Patch Available) WebHCat, Templeton need to support proxy users -- Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: gateay, proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNEGO. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the douser argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but it does not work for Templeton, as Templeton does not support proxy users. Hence the request to add this improvement to Templeton.