[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675656#comment-13675656 ]

Hudson commented on HIVE-4646:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) (Revision 1489441)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489441
Files :
* /hive/trunk/hcatalog/build.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java

skewjoin.q is failing in hadoop2

Key: HIVE-4646
URL: https://issues.apache.org/jira/browse/HIVE-4646
Project: Hive
Issue Type: Test
Components: Query Processor
Reporter: Navis
Assignee: Navis
Fix For: 0.12.0
Attachments: HIVE-4646.D11043.1.patch

https://issues.apache.org/jira/browse/HDFS-538 changed the filesystem API to throw an exception instead of returning null for a non-existing path, but the skew resolver still depends on the old behavior.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
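The HDFS-538 behavior change described above (null for a missing path becomes a thrown exception) can be sketched with Python's stdlib standing in for the Java FileSystem API; the wrapper name and paths below are illustrative, not Hive's actual code:

```python
import os

def list_dir_or_none(path):
    """Tolerate both API behaviors: a missing path yields None
    instead of propagating FileNotFoundError, mirroring the
    pre-HDFS-538 convention of returning null for a non-existing
    path that the skew resolver relied on."""
    try:
        return os.listdir(path)
    except FileNotFoundError:
        return None

# A caller written against the old "null on missing path" contract
# keeps working once routed through the wrapper.
entries = list_dir_or_none("/no/such/skewjoin/bigkeys")
if entries is None:
    entries = []  # nothing to resolve for this skewed key
print(len(entries))
```

The fix in ConditionalResolverSkewJoin is presumably of this defensive shape: catch the new exception where the old null check used to be.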
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675655#comment-13675655 ]

Hudson commented on HIVE-2670:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) (Revision 1489376)

Result = FAILURE
gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489376
Files :
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/tools/test/floatpostprocessor.pl

A cluster test utility for Hive

Key: HIVE-2670
URL: https://issues.apache.org/jira/browse/HIVE-2670
Project: Hive
Issue Type: New Feature
Components: Testing Infrastructure
Reporter: Alan Gates
Assignee: Johnny Zhang
Fix For: 0.12.0
Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch

Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness.
[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675653#comment-13675653 ]

Hudson commented on HIVE-4546:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) (Revision 1489431)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489431
Files :
* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java

Hive CLI leaves behind the per session resource directory on non-interactive invocation

Key: HIVE-4546
URL: https://issues.apache.org/jira/browse/HIVE-4546
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0
Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch

As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The CLI fails to remove it when invoked using -f or -e (non-interactive mode).
[jira] [Commented] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
[ https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675654#comment-13675654 ]

Hudson commented on HIVE-4377:
------------------------------

Integrated in Hive-trunk-h0.21 #2128 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2128/])
HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340) (Navis via Ashutosh Chauhan) (Revision 1489436)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489436
Files :
* /hive/trunk/hcatalog/build.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
* /hive/trunk/ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q
* /hive/trunk/ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out

Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)

Key: HIVE-4377
URL: https://issues.apache.org/jira/browse/HIVE-4377
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Navis
Fix For: 0.12.0
Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch, HIVE-4377.D10377.3.patch

Thanks a lot for addressing optimization in HIVE-2340. Awesome! Since we are developing at a very fast pace, it would be really useful to think about maintainability and testing of the large codebase. Highlights which are applicable for D1209:

1. Javadoc for all public/private functions, except for setters/getters. For any complex function, clear examples (input/output) would really help.
2. Especially for query optimizations, it might be a good idea to have a simple working query at the top, and the expected changes: e.g. the operator tree for that query at each step, or a detailed explanation at the top.
3. If possible, the test name (.q file) where the function is being invoked, or the query which would potentially test that scenario, if it is a query processor change.
4. Comments in each test (.q file) that include the JIRA number, what it is trying to test, and assumptions about each query.
5. Reduce the output for each test: whenever a query outputs more than 10 results, there should be a reason; otherwise, each query result should be bounded by 10 rows.

Thanks a lot.
Hive-trunk-h0.21 - Build # 2128 - Still Failing
Changes for Build #2103: [daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows
Changes for Build #2104: [daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness
Changes for Build #2105:
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley)
[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley)
Changes for Build #2106:
Changes for Build #2107: [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley)
Changes for Build #2108:
Changes for Build #2109:
Changes for Build #2110:
Changes for Build #2111:
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe (Gunther Hagleitner via omalley)
[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley)
Changes for Build #2112:
Changes for Build #2113: [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)
Changes for Build #2114: [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates)
Changes for Build #2115:
Changes for Build #2116: [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis)
Changes for Build #2117:
Changes for Build #2118:
Changes for Build #2119:
Changes for Build #2120:
Changes for Build #2121:
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis)
[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis)
Changes for Build #2122:
Changes for Build #2123:
Changes for Build #2124: [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates)
Changes for Build #2125: [daijy] PIG-3337: Fix remaining Window e2e tests
Changes for Build #2126:
[hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan)
[hashutosh] HIVE-3846 : alter view rename NPEs with authorization on (Teddy Choi via Ashutosh Chauhan)
[hashutosh] HIVE-4403 : Running Hive queries on YARN (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan)
[hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE-4578 (Brock Noland via Ashutosh Chauhan)
[hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4562 : HIVE-3393 brought in the Jackson library, and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan)
[hashutosh] HIVE-4489 : beeline always returns the same error message twice (Chaoyu Tang via Ashutosh Chauhan)
[hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan)
[hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan)
Changes for Build #2127:
[hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan)
[hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan)
[navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis)
Changes for Build #2128:
[hashutosh] HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340) (Navis via Ashutosh Chauhan)
[hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan)
[gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates)

All tests passed.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2128)
Status: Still Failing
Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2128/ to view the results.
[jira] [Created] (HIVE-4660) Let there be Tez (aka mrr ftw)
Gunther Hagleitner created HIVE-4660:
-------------------------------------

Summary: Let there be Tez (aka mrr ftw)
Key: HIVE-4660
URL: https://issues.apache.org/jira/browse/HIVE-4660
Project: Hive
Issue Type: New Feature
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HiveonTez.pdf

Tez is a new application framework built on Hadoop YARN that can execute complex directed acyclic graphs of general data-processing tasks. Here's the project's page: http://incubator.apache.org/projects/tez.html

The interesting thing about Tez from Hive's perspective is that it will, over time, allow us to overcome inefficiencies in query processing that come from having to express every algorithm in the map-reduce paradigm. The barrier to entry is pretty low as well: Tez can actually run unmodified MR jobs. As a first step, we can without much trouble start using more of Tez's features by taking advantage of the MRR pattern. MRR simply means that any number of reduce stages can follow a single map stage, without having to write intermediate results to HDFS and re-read them in a new job. This is common when queries require multiple shuffles on keys without correlation (e.g. join - grp by - window function - order by). For more details see the attached design doc.
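The MRR pattern described above can be illustrated with a toy Python pipeline: one "map" output goes through two chained "reduce" stages entirely in memory, with no intermediate write between the two shuffles. The data and names are illustrative, not anything from Hive or Tez:

```python
from itertools import groupby
from operator import itemgetter

# One map stage followed by two "reduce" stages chained in memory:
# the shape MRR gives a query like "join -> group by -> order by"
# without an HDFS round-trip between the shuffles.
rows = [("a", 3), ("b", 1), ("a", 2), ("b", 5)]

# Shuffle 1: sort/group by key and aggregate (the "group by" reduce).
rows.sort(key=itemgetter(0))
sums = [(k, sum(v for _, v in grp))
        for k, grp in groupby(rows, key=itemgetter(0))]

# Shuffle 2: re-sort on an uncorrelated key (the "order by" reduce),
# consuming the previous stage's output directly.
ordered = sorted(sums, key=itemgetter(1), reverse=True)
print(ordered)  # [('b', 6), ('a', 5)]
```

In classic MR each shuffle would be its own job, with `sums` written to HDFS and re-read; MRR keeps the stages in one DAG.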
[jira] [Updated] (HIVE-4660) Let there be Tez (aka mrr ftw)
[ https://issues.apache.org/jira/browse/HIVE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4660:
-------------------------------------

Attachment: HiveonTez.pdf

Key: HIVE-4660
URL: https://issues.apache.org/jira/browse/HIVE-4660
Project: Hive
Issue Type: New Feature
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HiveonTez.pdf
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675698#comment-13675698 ]

Jaideep Dhok commented on HIVE-4569:
------------------------------------

Update on the work done so far -
# Added getQueryPlan API with Thrift
# Added support for non-blocking queries.
## Right now I have done this by passing a boolean flag while calling executeStatement.
## If the flag is set to true, the query runs in non-blocking mode. The flag defaults to false.
## I've implemented this by adding a fixed-size thread pool in the OperationManager for running non-blocking operations. A reference to the future is kept in the operation, so that it can be cancelled.
## Once the query is running in the background, users can poll status using GetOperationStatus.
## Users can cancel the query by calling CancelOperation.
# Additions in GetOperationStatus
## OperationManager calls operation.getTaskStatuses(); each operation can override this method to customize reporting.
## SQLOperation returns the task statuses by calling getTaskStatuses() on the current driver.
## The driver reports task statuses by iterating through all tasks in the plan.
## Changes in the HS2 Thrift API:
{code}
// GetOperationStatus()
//
// Get the status of an operation running on the server.
struct TGetOperationStatusReq {
  // Operation to get the status for
  1: required TOperationHandle operationHandle
}

// State of a sub-task in an operation
enum TTaskState {
  // The task has been initialized
  INITIALIZED_STATE,
  // Driver is currently running the task
  RUNNING_STATE,
  // Task is completed
  FINISHED_STATE,
  // Task is queued in the driver
  QUEUED_STATE,
  // State is unknown
  UNKNOWN_STATE
}

// Status of a sub-task in an operation
struct TTaskStatus {
  // Task ID
  1: required string taskId
  // External ID for this task; for example MapRedTask can return the job ID of the Hadoop job
  2: optional string externalHandle
  // Current state of the task as seen by the driver
  3: required TTaskState state
}

struct TGetOperationStatusResp {
  1: required TStatus status
  // State of the whole operation
  2: optional TOperationState operationState
  // List of statuses of sub-tasks
  3: optional list<TTaskStatus> taskStatuses
}
{code}

Things pending as of now:
# If the task runs in a sub-process, the external handle (job ID) is returned as null.

GetQueryPlan api in Hive Server2

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
Attachments: git-4569.patch, HIVE-4569.D10887.1.patch

It would be nice to have GetQueryPlan as a Thrift API. I do not see a GetQueryPlan API available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API contains it; not sure why it was not added.
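The polling workflow the comment describes (submit non-blocking, poll GetOperationStatus until a terminal state) can be sketched client-side in Python. Everything here is a stand-in: the `FakeOperation` class and state names mirror the proposed Thrift enum but do not talk to a real HiveServer2:

```python
from enum import Enum, auto

class TTaskState(Enum):
    """Mirrors the proposed Thrift enum (illustrative only)."""
    INITIALIZED = auto()
    QUEUED = auto()
    RUNNING = auto()
    FINISHED = auto()
    UNKNOWN = auto()

class FakeOperation:
    """Stand-in for a non-blocking HS2 operation; a real client
    would issue GetOperationStatus calls over Thrift instead."""
    def __init__(self):
        self._states = iter([TTaskState.QUEUED, TTaskState.RUNNING,
                             TTaskState.FINISHED])
        self.state = TTaskState.INITIALIZED

    def get_status(self):
        # Each poll observes the next state; stays FINISHED at the end.
        self.state = next(self._states, TTaskState.FINISHED)
        return self.state

def wait_for(op, max_polls=100):
    # Poll until the operation reports a terminal state.
    for _ in range(max_polls):
        if op.get_status() is TTaskState.FINISHED:
            return True
    return False

print(wait_for(FakeOperation()))  # True
```

A real client would sleep between polls and also check for a cancelled/error state from `TGetOperationStatusResp`.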
[jira] [Created] (HIVE-4661) Unable to wrap analytical function in another function
Frans Drijver created HIVE-4661:
--------------------------------

Summary: Unable to wrap analytical function in another function
Key: HIVE-4661
URL: https://issues.apache.org/jira/browse/HIVE-4661
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.11.0
Reporter: Frans Drijver

I am unable to wrap an analytical function in another function, like so:
{quote}
select case when ta_end_datetime_berekenen = 'Y'
            then lead(ta_update_datetime) over ( partition by dn_waarde_van, dn_waarde_tot order by ta_update_datetime )
            else ea_end_datetime
       end as ea_end_datetime
     , ta_insert_datetime
     , ta_update_datetime
from tmp_wtdh_bestedingsklasse_10_s2_stap2
{quote}
This produces the following error:
{quote}
NoViableAltException(86@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
FAILED: ParseException line 1:175 missing KW_END at 'over' near ')' in selection target
line 1:254 cannot recognize input near 'else' 'ea_end_datetime' 'end' in selection target
{quote}
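A common rewrite when a parser rejects a window function inside another expression is to compute the window value in a subquery and apply the outer function (here, CASE) one level up. The sketch below shows the pattern with sqlite3 (3.25+) standing in for Hive; the table and column names are abbreviated stand-ins for those in the report:

```python
import sqlite3

# Parser-friendly rewrite: lead() is evaluated in the inner query,
# the CASE wraps only a plain column reference in the outer query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (flag TEXT, upd INTEGER, end_dt INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [("Y", 1, 10), ("Y", 2, 20), ("N", 3, 30)])

rows = con.execute("""
    SELECT CASE WHEN flag = 'Y' THEN next_upd ELSE end_dt END
    FROM (SELECT flag, end_dt,
                 lead(upd) OVER (ORDER BY upd) AS next_upd
          FROM t) sub
    ORDER BY end_dt
""").fetchall()
print(rows)  # [(2,), (3,), (30,)]
```

Whether this rewrite sidesteps the Hive 0.11 parse error would need verification against that version; it is the standard workaround for this class of limitation.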
[jira] [Updated] (HIVE-4115) Introduce cube abstraction in hive
[ https://issues.apache.org/jira/browse/HIVE-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4115:
------------------------------

Attachment: HIVE-4115.D10689.3.patch

Amareshwari updated the revision "HIVE-4115 [jira] Introduce cube abstraction in hive".

- Fix AliasReplacer - Queries with starting of the month as start period should be considered for MONTHLY update period
- Add validations for all the tests in TestCubeDriver

Reviewers: JIRA, njain, alanfgates, omalley, cwsteinbach, ashutoshc

REVISION DETAIL
https://reviews.facebook.net/D10689

CHANGE SINCE LAST DIFF
https://reviews.facebook.net/D10689?vs=34029&id=34299#toc

AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/Driver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/AbstractCubeTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/BaseDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ColumnMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Cube.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeDimensionTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeFactTable.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeMetastoreClient.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/CubeTableType.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ExprMeasure.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HDFSStorage.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/HierarchicalDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/InlineDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/MetastoreUtil.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Named.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/ReferencedDimension.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/Storage.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/StorageConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/TableReference.java
ql/src/java/org/apache/hadoop/hive/ql/cube/metadata/UpdatePeriod.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AggregateResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/AliasReplacer.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckColumnMapping.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckDateRange.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CheckTableNames.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ContextRewriter.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryConstants.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryContext.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryExpr.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeQueryRewriter.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/CubeSemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/DateUtil.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/GroupbyResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/HQLParser.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/JoinResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastDimensionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LeastPartitionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/LightestFactResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/PartitionResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/StorageTableResolver.java
ql/src/java/org/apache/hadoop/hive/ql/cube/parse/ValidationRule.java
ql/src/java/org/apache/hadoop/hive/ql/cube/processors/CubeDriver.java
ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java
ql/src/test/org/apache/hadoop/hive/ql/cube/metadata/TestCubeMetastoreClient.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/CubeTestSetup.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestCubeSemanticAnalyzer.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestDateUtil.java
ql/src/test/org/apache/hadoop/hive/ql/cube/parse/TestMaxUpdateInterval.java
ql/src/test/org/apache/hadoop/hive/ql/cube/processors/TestCubeDriver.java

To: JIRA, njain, alanfgates, omalley, cwsteinbach, ashutoshc, Amareshwari

Introduce cube abstraction in hive

Key: HIVE-4115
URL: https://issues.apache.org/jira/browse/HIVE-4115
Project: Hive
Issue Type: New Feature
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Attachments:
[jira] [Created] (HIVE-4662) first_value can't have more than one order by column
Frans Drijver created HIVE-4662:
--------------------------------

Summary: first_value can't have more than one order by column
Key: HIVE-4662
URL: https://issues.apache.org/jira/browse/HIVE-4662
Project: Hive
Issue Type: Bug
Components: SQL
Affects Versions: 0.11.0
Reporter: Frans Drijver

In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so:
{quote}
select distinct first_value(kastr.DEWNKNR)
over ( partition by kastr.DEKTRNR order by kastr.DETRADT, rettr.DEVPDNR )
from RTAVP_DRKASTR kastr;
{quote}
Error given:
{quote}
FAILED: SemanticException Range based Window Frame can have only 1 Sort Key
{quote}
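The error mentions a RANGE-based window frame: when an ORDER BY is present the default frame is RANGE, which Hive restricts to a single sort key. Spelling out an explicit ROWS frame is the usual workaround. The sketch below shows the idea with sqlite3 standing in for Hive; table and column names are stand-ins, and whether Hive 0.11 accepts this exact form would need checking:

```python
import sqlite3

# first_value over two sort keys, with an explicit ROWS frame so the
# engine need not build a RANGE frame over a composite ordering.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE k (grp INTEGER, a INTEGER, b INTEGER, v TEXT)")
con.executemany("INSERT INTO k VALUES (?, ?, ?, ?)",
                [(1, 2, 9, "x"), (1, 1, 5, "first"), (2, 7, 1, "only")])

rows = con.execute("""
    SELECT DISTINCT first_value(v) OVER (
        PARTITION BY grp ORDER BY a, b
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM k ORDER BY 1
""").fetchall()
print(rows)  # [('first',), ('only',)]
```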
Review Request: HIVE-4659 while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11652/
---

Review request for hive.

Description
-----------

https://issues.apache.org/jira/browse/HIVE-4659

This addresses bug HIVE-4659.
https://issues.apache.org/jira/browse/HIVE-4659

Diffs
-----

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1489269
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java 1489269

Diff: https://reviews.apache.org/r/11652/diff/

Testing
-------

$ hive -e "show create table v_test_1"
CREATE VIEW v_test_1 AS
select key, value, dt from (
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122'
  union all
  select `tmp_v_t1`.`key`, split(`tmp_v_t1`.`value`,'\\\t')[0] as `value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123'
) `t`;

Screenshots
-----------

Example View
https://reviews.apache.org/r/11652/s/27/

Thanks,
fangkun cao
[jira] [Updated] (HIVE-4659) while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results
[ https://issues.apache.org/jira/browse/HIVE-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4659:
-----------------------------

Attachment: HIVE-4659-1.patch

https://reviews.apache.org/r/11652/

while sql contains \t , 'desc formatted view_name' and 'show create table view_name' statements will generate incomplete results

Key: HIVE-4659
URL: https://issues.apache.org/jira/browse/HIVE-4659
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4659-1.patch

drop view if exists v_test;
CREATE VIEW v_test AS
select
  key,    -- start by \t\t
  value,  -- start by \t\t
  dt from -- start by \t\t
(
  select key, value, dt from tmp_v_t1 where dt='20130122'
  union all
  select key, value, dt from tmp_v_t1 where dt='20130123'
) t;

$ hive -e "show create table v_test"

UT-One: the three lines that start with \t are lost in the CREATE statement!

Logging initialized using configuration in file:/home/zongren/hive-conf/hive-log4j.properties
Hive history file=/tmp/zongren/hive_job_log_zongren_24155@hd17-vm5_201306051125_94165790.txt
OK
CREATE VIEW v_test AS select (
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130122'
  union all
  select `tmp_v_t1`.`key`, `tmp_v_t1`.`value`, `tmp_v_t1`.`dt` from `default`.`tmp_v_t1` where `tmp_v_t1`.`dt`='20130123'
) `t`
Time taken: 2.767 seconds, Fetched: 9 row(s)

UT-Two:
[jira] [Assigned] (HIVE-4346) when writing data into filesystem from queries, the output files could contain a line of column names
[ https://issues.apache.org/jira/browse/HIVE-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun reassigned HIVE-4346:
--------------------------------

Assignee: caofangkun

when writing data into filesystem from queries, the output files could contain a line of column names

Key: HIVE-4346
URL: https://issues.apache.org/jira/browse/HIVE-4346
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4346-1.patch, HIVE-4346-3.patch

For example:

hive> desc src;
key string
value string
hive> select * from src;
1 10
2 20
hive> set hive.output.markschema=true;
hive> insert overwrite local directory './test1' select * from src;
hive> !ls -l './test1';
./test1/_metadata
./test1/00_0
hive> !cat './test1/_metadata';
key^Avalue
hive> !cat './test1/00_0';
1^A10
2^A20
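The proposed behavior (a `_metadata` file with Ctrl-A-joined column names written next to the data files) can be sketched in Python. This is an illustrative sketch of the feature, not Hive's implementation; the part-file name follows Hive's usual `000000_0` convention but is a stand-in here:

```python
import os
import tempfile

SEP = "\x01"  # Hive's default field delimiter, shown as ^A in the report

def write_output(dirpath, columns, rows):
    """Write query output plus a _metadata header file, the behavior
    HIVE-4346 proposes behind hive.output.markschema (sketch only)."""
    os.makedirs(dirpath, exist_ok=True)
    with open(os.path.join(dirpath, "_metadata"), "w") as f:
        f.write(SEP.join(columns) + "\n")
    with open(os.path.join(dirpath, "000000_0"), "w") as f:
        for row in rows:
            f.write(SEP.join(map(str, row)) + "\n")

out = os.path.join(tempfile.mkdtemp(), "test1")
write_output(out, ["key", "value"], [(1, 10), (2, 20)])
# _metadata now holds "key^Avalue"; 000000_0 holds the ^A-delimited rows.
print(sorted(os.listdir(out)))
```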
[jira] [Assigned] (HIVE-4367) enhance TRUNCATE syntax to drop data of external table
[ https://issues.apache.org/jira/browse/HIVE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun reassigned HIVE-4367:
--------------------------------

Assignee: caofangkun

enhance TRUNCATE syntax to drop data of external table

Key: HIVE-4367
URL: https://issues.apache.org/jira/browse/HIVE-4367
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Attachments: HIVE-4367-1.patch

In my use case, sometimes I have to remove data of external tables to free up storage space in the cluster. So it's necessary to enhance the syntax, like

TRUNCATE TABLE srcpart_truncate PARTITION (dt='201130412') FORCE;

to remove data from an EXTERNAL table. And I add a configuration property to control whether removed data goes to the Trash:

<property>
  <name>hive.truncate.skiptrash</name>
  <value>false</value>
  <description>If false, truncated data is moved to the trash; if true, it is dropped immediately.</description>
</property>

For example:

hive (default)> TRUNCATE TABLE external1 partition (ds='11');
FAILED: Error in semantic analysis: Cannot truncate non-managed table external1
hive (default)> TRUNCATE TABLE external1 partition (ds='11') FORCE;
[2013-04-16 17:15:52]: Compile Start
[2013-04-16 17:15:52]: Compile End
[2013-04-16 17:15:52]: OK
[2013-04-16 17:15:52]: Time taken: 0.413 seconds
hive (default)> set hive.truncate.skiptrash;
hive.truncate.skiptrash=false
hive (default)> set hive.truncate.skiptrash=true;
hive (default)> TRUNCATE TABLE external1 partition (ds='12') FORCE;
[2013-04-16 17:16:21]: Compile Start
[2013-04-16 17:16:21]: Compile End
[2013-04-16 17:16:21]: OK
[2013-04-16 17:16:21]: Time taken: 0.143 seconds
hive (default)> dfs -ls /user/test/.Trash/Current/;
Found 1 items
drwxr-xr-x - test supergroup 0 2013-04-16 17:06 /user/test/.Trash/Current/ds=11
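The skiptrash semantics shown in the transcript (default moves the partition's data under `.Trash/Current`; `skiptrash=true` deletes immediately) can be sketched with local directories. Paths and the function name are illustrative stand-ins, not Hive's code:

```python
import os
import shutil
import tempfile

def truncate_partition(part_dir, trash_root, skiptrash=False):
    """Sketch of the proposed FORCE truncate for external tables:
    by default the partition directory is moved into a trash
    directory; with skiptrash=True it is deleted immediately.
    Illustrative only."""
    if skiptrash:
        shutil.rmtree(part_dir)
    else:
        os.makedirs(trash_root, exist_ok=True)
        shutil.move(part_dir,
                    os.path.join(trash_root, os.path.basename(part_dir)))
    os.makedirs(part_dir)  # leave an empty partition directory behind

base = tempfile.mkdtemp()
part = os.path.join(base, "ds=11")
os.makedirs(part)
with open(os.path.join(part, "data"), "w") as f:
    f.write("rows")
trash = os.path.join(base, ".Trash", "Current")
truncate_partition(part, trash, skiptrash=False)
print(os.listdir(trash))  # ['ds=11']
```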
Re: error in running the hive test cases
Check if hadoop-test-*.jar is in the classpath.

2013/6/4 ur lops <urlop...@gmail.com>:
> Hi,
> When I run the Hive test cases, I keep getting the following error:
>
> [echo] Project: serde
> [javac] Compiling 36 source files to /home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes
> [javac] TestAvroSerdeUtils.java:24: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: package org.apache.hadoop.hdfs
> [javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
> [javac]                               ^
> [javac] TestAvroSerdeUtils.java:184: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
> [javac] MiniDFSCluster miniDfs = null;
> [javac] ^
> [javac] TestAvroSerdeUtils.java:187: cannot find symbol
> [javac] symbol  : class MiniDFSCluster
> [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
> [javac] miniDfs = new MiniDFSCluster(new Configuration(), 1, true, null);
> [javac]           ^
> [javac] Note: Some input files use or override a deprecated API.
> [javac] Note: Recompile with -Xlint:deprecation for details.
> [javac] Note: Some input files use unchecked or unsafe operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
>
> I am building Hive 0.9 and running the tests using 'ant package test'.
> Can someone give me a pointer as to which jar is missing from the
> classpath and how to resolve it.
> Thanks

--
Best wishes!
Fangkun.Cao
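The advice above (MiniDFSCluster lives in the hadoop-test jar) comes down to checking whether any classpath entry matches that jar name. A quick illustrative helper, with a made-up classpath (the function name and sample paths are stand-ins):

```python
import fnmatch

def has_jar(classpath, pattern="hadoop-test-*.jar"):
    """Return True if any colon-separated classpath entry's file name
    matches the jar pattern. Illustrative helper, not part of the
    Hive build."""
    names = (entry.rsplit("/", 1)[-1] for entry in classpath.split(":"))
    return any(fnmatch.fnmatch(n, pattern) for n in names)

cp = "/opt/lib/hadoop-core-0.20.2.jar:/opt/lib/hadoop-test-0.20.2.jar"
print(has_jar(cp))  # True
```

Running it against the value of `$CLASSPATH` (or the classpath the ant build assembles) shows whether the MiniDFSCluster classes can be found at all.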
[jira] [Updated] (HIVE-4662) first_value can't have more than one order by column
[ https://issues.apache.org/jira/browse/HIVE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frans Drijver updated HIVE-4662: Description: In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} was: In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, rettr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} first_value can't have more than one order by column Key: HIVE-4662 URL: https://issues.apache.org/jira/browse/HIVE-4662 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Reporter: Frans Drijver In the current implementation of the first_value function, it's not allowed to have more than one (1) order by column, like so: {quote} select distinct first_value(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} Error given: {quote} FAILED: SemanticException Range based Window Frame can have only 1 Sort Key {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
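For context, a RANGE-based window frame is defined by comparing the sort key's *value* against a bound, so it is only well-defined for a single ORDER BY column; first_value itself only needs the first row of each partition in sort order. A minimal Python sketch of that per-partition semantics (column names taken from the query above; the sample rows are invented for illustration):

```python
# Sketch of first_value's per-partition semantics with a composite sort key.
# A RANGE frame, by contrast, needs value comparisons against one sort key,
# which is why Hive rejects more than one ORDER BY column here.

rows = [
    {"DEKTRNR": 1, "DETRADT": "2013-01-02", "DEVPDNR": 7, "DEWNKNR": "B"},
    {"DEKTRNR": 1, "DETRADT": "2013-01-01", "DEVPDNR": 9, "DEWNKNR": "A"},
    {"DEKTRNR": 2, "DETRADT": "2013-01-01", "DEVPDNR": 1, "DEWNKNR": "C"},
]

def first_value(rows, partition, order, value):
    result = {}
    for row in sorted(rows, key=lambda r: tuple(r[k] for k in order)):
        result.setdefault(row[partition], row[value])  # first row per partition wins
    return result

res = first_value(rows, "DEKTRNR", ("DETRADT", "DEVPDNR"), "DEWNKNR")
print(res)  # first DEWNKNR per DEKTRNR partition
```

Nothing about the function itself is ambiguous with two sort keys, which is why reporters find the restriction surprising; the limit comes from the default RANGE frame the compiler attaches.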
[jira] [Created] (HIVE-4663) Needlessly adding analytical windowing columns to my select
Frans Drijver created HIVE-4663: --- Summary: Needlessly adding analytical windowing columns to my select Key: HIVE-4663 URL: https://issues.apache.org/jira/browse/HIVE-4663 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Reporter: Frans Drijver Forgive the rather cryptic title, but I was unsure what the best summary would be. The situation is as follows: if I have a query in which I do both a select of a 'normal' column and an analytical function, like so: {quote} select distinct kastr.DELOGCE , lag(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) from RTAVP_DRKASTR kastr ; {quote} I get the following error: {quote} FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: org.apache.hadoop.hive.ql.parse.SemanticException: Line 3:41 Expression not in GROUP BY key 'DEKTRNR' {quote} The way around it is to also put the analytical windowing columns in my select, as such: {quote} select distinct kastr.DELOGCE , lag(kastr.DEWNKNR) over ( partition by kastr.DEKTRNR order by kastr.DETRADT, kastr.DEVPDNR ) , kastr.DEKTRNR , kastr.DEWNKNR , kastr.DETRADT , kastr.DEVPDNR from RTAVP_DRKASTR kastr ; {quote} Obviously this is generally unwanted behaviour, as it can widen the select significantly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4664) Support Hive specific DISTRIBUTE BY clause in VectorGroupByOperator
Remus Rusanu created HIVE-4664: -- Summary: Support Hive specific DISTRIBUTE BY clause in VectorGroupByOperator Key: HIVE-4664 URL: https://issues.apache.org/jira/browse/HIVE-4664 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuoluo (Clark) Yang updated HIVE-4561: --- Attachment: HIVE-4561.4.patch Updated patch; makes HIGH/LOW values of empty tables return null. Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/ --- (Updated June 5, 2013, 2:06 p.m.) Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, Shreepadma Venugopalan, and fangkun cao. Changes --- Like GenericUDAFMax/GenericUDAFMin, it returns null for high/low value. Description --- An initialization error. Make double and long initialize correctly. Would you review that and assign the issue to me? This addresses bug HIVE-4561. https://issues.apache.org/jira/browse/HIVE-4561 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292 Diff: https://reviews.apache.org/r/11172/diff/ Testing --- ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q done. Thanks, Zhuoluo Yang
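The initialization error described above can be sketched in a few lines of Python (an illustration, not the actual GenericUDAFComputeStats code): seeding the running low/high with 0.0 makes 0.0 act as a phantom column value, which is exactly why LOW_VALUE is pinned at 0.0000 for an all-positive column; initializing to null, as GenericUDAFMax/GenericUDAFMin do, fixes it and also gives empty tables a null high/low.

```python
# Hypothetical sketch of the min/max-tracking bug, not Hive's implementation.

def stats_buggy(values):
    low, high = 0.0, 0.0          # wrong: 0.0 acts as a phantom column value
    for v in values:
        low, high = min(low, v), max(high, v)
    return low, high

def stats_fixed(values):
    low = high = None             # like GenericUDAFMin/Max: null until the first row
    for v in values:
        low = v if low is None else min(low, v)
        high = v if high is None else max(high, v)
    return low, high

print(stats_buggy([1.0, 2.0, 3.0]))   # (0.0, 3.0) -- LOW_VALUE wrongly 0.0
print(stats_fixed([1.0, 2.0, 3.0]))   # (1.0, 3.0)
print(stats_fixed([]))                # (None, None) -- empty table returns null
```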
[jira] [Updated] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhuoluo (Clark) Yang updated HIVE-4561: --- Status: Patch Available (was: Open) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/#review21480 --- Ship it! +1 - Ashutosh Chauhan On June 5, 2013, 2:06 p.m., Zhuoluo Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/ --- (Updated June 5, 2013, 2:06 p.m.) Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, Shreepadma Venugopalan, and fangkun cao. Description --- An initialization error. Make double and long initialize correctly. Would you review that and assign the issue to me? This addresses bug HIVE-4561. https://issues.apache.org/jira/browse/HIVE-4561 Diffs - http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 1489292 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1489292 Diff: https://reviews.apache.org/r/11172/diff/ Testing --- ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q done. Thanks, Zhuoluo Yang
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4435: --- Affects Version/s: 0.11.0 Status: Open (was: Patch Available) Canceling patch since current patch is resulting in test failures. Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
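To illustrate the point about primary-key columns, here is a rough Python sketch (not Hive's implementation) of a Flajolet-Martin style NDV estimate over a monotonically increasing key sequence, using the textbook pairwise-independent family h(x) = (a*x + b) mod p. With such hashes the trailing-zero pattern is close to uniform even for sequential keys; the constants and the averaging over independent hashes are standard choices, not values from the patch.

```python
import random

P = 2_147_483_647  # Mersenne prime, larger than the key universe

def make_hash():
    # pairwise-independent family: h(x) = (a*x + b) mod p
    a = random.randrange(1, P)
    b = random.randrange(0, P)
    return lambda x: (a * x + b) % P

def trailing_zeros(n):
    # position of the lowest set bit; cap for the (negligible) h(x) == 0 case
    return (n & -n).bit_length() - 1 if n else 32

def fm_estimate(values, num_hashes=64):
    random.seed(42)  # fixed seed so the sketch is reproducible
    total = 0
    for _ in range(num_hashes):
        h = make_hash()
        total += max(trailing_zeros(h(v)) for v in values)
    # 2^(mean R) / 0.77351 is the classic Flajolet-Martin correction
    return (2 ** (total / num_hashes)) / 0.77351

keys = list(range(1, 10001))    # primary-key-like monotonic sequence
est = fm_estimate(keys)
print(round(est))               # an estimate near the true NDV of 10000
```

With a weak, non-independent hash (e.g. one correlated with the key itself), the trailing-zero distribution of a monotonic sequence is far from what the estimator assumes, producing the discrepancies seen on TPC-H primary keys.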
[jira] [Commented] (HIVE-4568) Beeline needs to support resolving variables
[ https://issues.apache.org/jira/browse/HIVE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675964#comment-13675964 ] Ashutosh Chauhan commented on HIVE-4568: [~cwsteinbach] Do you have any further comments on the patch? Beeline needs to support resolving variables Key: HIVE-4568 URL: https://issues.apache.org/jira/browse/HIVE-4568 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4568.patch Beeline currently doesn't support variable (system, env, etc.) substitution as the Hive client does. Supporting this feature would certainly make it more usable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
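A hedged sketch of the kind of ${...} substitution the Hive CLI performs and this issue asks Beeline to support. The "hivevar:"/"env:" namespace prefixes and sample variables below are illustrative assumptions, not Beeline's actual API.

```python
import re

def substitute(sql, variables):
    """Replace ${name} occurrences with values from `variables`."""
    def repl(match):
        name = match.group(1)
        # leave unknown variables untouched, as a real client would surface them
        return str(variables.get(name, match.group(0)))
    return re.sub(r"\$\{([^}]+)\}", repl, sql)

hive_vars = {"hivevar:tbl": "person_age", "env:USER": "xzhang"}
print(substitute("SELECT * FROM ${hivevar:tbl} WHERE owner = '${env:USER}'", hive_vars))
# SELECT * FROM person_age WHERE owner = 'xzhang'
```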
[jira] [Commented] (HIVE-4355) HCatalog test TestPigHCatUtil might fail on JDK7
[ https://issues.apache.org/jira/browse/HIVE-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675969#comment-13675969 ] Ashutosh Chauhan commented on HIVE-4355: bq. I’ve seen TestPigHCatUtil failing because the order of method calls was different than when compiling and running the tests only on JDK 6 or only on JDK 7. Can you explain this a bit more? From the patch, it's not obvious how it solves the problem you have identified. HCatalog test TestPigHCatUtil might fail on JDK7 Key: HIVE-4355 URL: https://issues.apache.org/jira/browse/HIVE-4355 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Attachments: HIVE-4355.patch I’ve tried an interesting scenario: I compiled HCatalog with JDK 6 (including tests) and ran the tests themselves on JDK 7. My motivation was to see what would happen to users that download an official Apache release (usually compiled on JDK 6) and run it on JDK 7. I’ve seen {{TestPigHCatUtil}} failing because the order of method calls was different than when compiling and running the tests only on JDK 6 or only on JDK 7. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675971#comment-13675971 ] Ashutosh Chauhan commented on HIVE-4459: +1 Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675985#comment-13675985 ] Ashutosh Chauhan commented on HIVE-4554: TestMinimrCliDriver.schemeAuthority.q fails with exception {{mkdir: cannot create directory hdfs:///tmp/test: File exists}} I think if you modify the last line of your test to do dfs -rmr hdfs:///tmp/test that should be sufficient. Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4 To reproduce the problem:
1. Create a table, say, person_age (name STRING, age INT).
2. Create a file whose name has a space in it, say, "data set.txt".
3. Try to load the data in the file into the table.
The following error can be seen in the console:
hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
Loading data to table default.person_age
Failed with exception Wrong file format. Please check the file's format.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
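An illustrative guess at the failure mode (this is not Hive's MoveTask code): a raw space in a local path such as "data set.txt" breaks naive URI construction, whereas percent-encoding the path component round-trips cleanly.

```python
from urllib.parse import quote, unquote

local_path = "/home/xzhang/temp/data set.txt"
uri = "file://" + quote(local_path)   # '/' is in quote()'s default safe set
print(uri)                            # file:///home/xzhang/temp/data%20set.txt

# decoding the path component recovers the original filesystem path
assert unquote(uri[len("file://"):]) == local_path
```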
[jira] [Commented] (HIVE-4390) Enable capturing input URI entities for DML statements
[ https://issues.apache.org/jira/browse/HIVE-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675987#comment-13675987 ] Ashutosh Chauhan commented on HIVE-4390: I didn't get the backward compatibility problem (and thus the need for a config variable) here. Enable capturing input URI entities for DML statements -- Key: HIVE-4390 URL: https://issues.apache.org/jira/browse/HIVE-4390 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-4390-2.patch The query compiler doesn't capture the files or directories accessed by the following statements: * Load data * Export * Import * Alter table/partition set location This is very useful information to access from hooks for monitoring/auditing etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold
[ https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676104#comment-13676104 ] Kevin Wilfong commented on HIVE-4324: - Sorry for the delay, Owen. Are you concerned that there will be applications outside of Hive calling methods in OrcFile.java? If so, I can add the backward compatible method. ORC Turn off dictionary encoding when number of distinct keys is greater than threshold --- Key: HIVE-4324 URL: https://issues.apache.org/jira/browse/HIVE-4324 Project: Hive Issue Type: Sub-task Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4324.1.patch.txt Add a configurable threshold so that if the number of distinct values in a string column is greater than that fraction of non-null values, dictionary encoding is turned off. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
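The check the issue describes can be sketched as follows (a hedged illustration, not the patch itself; the 0.8 default threshold is a made-up value): when the distinct/non-null ratio exceeds the threshold, the column is nearly unique and a dictionary would cost space without compressing anything.

```python
def use_dictionary(values, threshold=0.8):
    """Decide whether dictionary encoding pays off for a string column.

    threshold: maximum distinct/non-null ratio at which a dictionary is
    still worthwhile (the 0.8 default here is hypothetical).
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return True  # nothing to encode; either choice is fine
    ratio = len(set(non_null)) / len(non_null)
    return ratio <= threshold

print(use_dictionary(["a", "b", "a", None, "a"]))  # True: 2 distinct / 4 non-null
print(use_dictionary(["a", "b", "c", "d", "e"]))   # False: every value distinct
```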
[jira] [Created] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
Eric Hanson created HIVE-4665: - Summary: error at VectorExecMapper.close in group-by-agg query over ORC, vectorized Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey

CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM';

create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712;

hive> select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. 
Estimated from input data size: 3
In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>
In order to set a constant number of reducers: set mapred.reduce.tasks=<number>
Validating if vectorized execution is applicable
Going down the vectorization path
java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new(); Continuing ...
java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ...
java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ...
Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016
Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3
2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0%
2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201306041757_0016 with errors
Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016
Task with the most failures(4):
- Task ID: task_201306041757_0016_m_00
URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00
- Diagnostic Messages for this Task:
java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
at org.apache.hadoop.mapred.Child.main(Child.java:265)
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:40)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481)
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:235)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:253)
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4657: - Attachment: HIVE-4657.1.patch HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4657: - Status: Patch Available (was: Open) HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676190#comment-13676190 ] Shreepadma Venugopalan commented on HIVE-4657: -- This fixes the build which is currently broken. HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
Tony Murphy created HIVE-4666: - Summary: Count(*) over tpch lineitem ORC results in Error: Java heap space Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject
[ https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676299#comment-13676299 ] Hudson commented on HIVE-2304: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) (Revision 1489704) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489704 Files : * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HivePreparedStatement.java * /hive/trunk/jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java Support PreparedStatement.setObject --- Key: HIVE-2304 URL: https://issues.apache.org/jira/browse/HIVE-2304 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: 0.7.1 Reporter: Ido Hadanny Assignee: Ido Hadanny Priority: Minor Fix For: 0.12.0 Attachments: HIVE-0.8-SetObject.2.patch.txt Original Estimate: 1h Remaining Estimate: 1h PreparedStatement.setObject is important for Spring's JdbcTemplate support
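For context on what the patch enables: JDBC's setObject has to inspect the runtime type of its argument and delegate to the matching typed setter. A minimal, hypothetical sketch of that dispatch (illustrative names only, not Hive's actual HivePreparedStatement code):

```java
// Hypothetical sketch of setObject-style type dispatch; not Hive's actual code.
public class SetObjectDispatch {

    /** Returns the name of the typed setter that setObject(value) would delegate to. */
    static String setterFor(Object value) {
        if (value == null) return "setNull";
        if (value instanceof String) return "setString";
        if (value instanceof Integer) return "setInt";
        if (value instanceof Long) return "setLong";
        if (value instanceof Double) return "setDouble";
        if (value instanceof Boolean) return "setBoolean";
        // The JDBC contract calls for an SQLException on unmappable types.
        throw new IllegalArgumentException("no SQL mapping for " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(setterFor("abc")); // setString
        System.out.println(setterFor(42));    // setInt (autoboxed to Integer)
    }
}
```

This is why the issue matters for Spring: JdbcTemplate passes query parameters as a plain Object[], so a driver whose setObject always throws cannot be used with it at all.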
[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established
[ https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676301#comment-13676301 ] Hudson commented on HIVE-4566: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) (Revision 1489672) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489672 Files : * /hive/trunk/beeline/src/java/org/apache/hive/beeline/Commands.java * /hive/trunk/beeline/src/test/org/apache/hive/beeline/src/test/TestBeeLineWithArgs.java NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established - Key: HIVE-4566 URL: https://issues.apache.org/jira/browse/HIVE-4566 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4566.patch, HIVE-4566.patch.1 Before a DB connection is established, executing a command such as typeinfo or nativesql results in an NPE shown at the console: beeline !typeinfo java.lang.NullPointerException beeline !nativesql java.lang.NullPointerException Instead, a message such as 'No current connection' should be given, as in the case of some other commands, such as dropall.
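The shape of such a fix is a guard clause: check for a live connection before dispatching the command, and print a message instead of dereferencing null. A hypothetical stand-alone sketch (not the actual beeline Commands.java code):

```java
// Hypothetical sketch of a null-connection guard; not beeline's actual code.
public class ConnectionGuard {

    /** Stand-in for a beeline metadata command handler; never throws NPE. */
    static String runMetadataCommand(Object connection, String command) {
        if (connection == null) {
            // Without this guard, the handler would invoke a method on the null
            // connection and surface a bare java.lang.NullPointerException.
            return "No current connection";
        }
        return "executing " + command;
    }

    public static void main(String[] args) {
        System.out.println(runMetadataCommand(null, "typeinfo")); // No current connection
    }
}
```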
[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded
[ https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676302#comment-13676302 ] Hudson commented on HIVE-4526: -- Integrated in Hive-trunk-h0.21 #2129 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2129/]) HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) (Revision 1489703) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1489703 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out auto_sortmerge_join_9.q throws NPE but test is succeeded Key: HIVE-4526 URL: https://issues.apache.org/jira/browse/HIVE-4526 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4526.D10725.1.patch auto_sortmerge_join_9.q {noformat} [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Begin query: auto_sortmerge_join_9.q [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1 [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Attachment: output query output Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\count_output' SELECT Count(*) AS count_order FROM lineitem_orc the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space { INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc } the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space { INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc } the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
Hive-trunk-h0.21 - Build # 2129 - Still Failing
Changes for Build #2105 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #2106 Changes for Build #2107 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #2108 Changes for Build #2109 Changes for Build #2110 Changes for Build #2111 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #2112 Changes for Build #2113 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) Changes for Build #2114 [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates) Changes for Build #2115 Changes for Build #2116 [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis) Changes for Build #2117 Changes for Build #2118 Changes for Build #2119 Changes for Build #2120 Changes for Build #2121 [navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis) [navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis) Changes for Build #2122 Changes for Build #2123 Changes for Build #2124 [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates) Changes for Build #2125 [daijy] PIG-3337: Fix remaining Window e2e tests Changes for Build #2126 [hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan) [hashutosh] HIVE-3846 : alter view rename NPEs with authorization on. 
(Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan) [hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) [hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) [hashutosh] HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang via Ashutosh Chauhan) [hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) [hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan) Changes for Build #2127 [hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan) [navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) Changes for Build #2128 [hashutosh] HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE2340) : (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) [gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) Changes for Build #2129 [hashutosh] HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) [hashutosh] HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis 
via Ashutosh Chauhan) [hashutosh] HIVE-4516 : Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java (Jon Hartlaub and Navis via Ashutosh Chauhan) [hashutosh] HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2129) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2129/ to view the results.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
[jira] [Updated] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Murphy updated HIVE-4666: -- Description: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {quote} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {quote} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. was: Executing the following query over an orc tpch line item table fails due to Error: Java heap space {{ INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc }} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch Attachments: output Executing the following query over an orc tpch line item table fails due to Error: Java heap space {quote} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {quote} the line item table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #394
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/394/ -- [...truncated 35315 lines...] [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-53_990_6701023171783792339/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_1750664190.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt' into table testhivedrivertable [junit] 
POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-57_945_2116625151976318778/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-06-05_13-24-57_945_2116625151976318778/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_1354410736.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: 
default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_996382908.txt [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201306051324_809971567.txt [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK:
[jira] [Created] (HIVE-4667) tpch query 1 fails with java.lang.ClassCastException
Tony Murphy created HIVE-4667: - Summary: tpch query 1 fails with java.lang.ClassCastException Key: HIVE-4667 URL: https://issues.apache.org/jira/browse/HIVE-4667 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Fix For: vectorization-branch {noformat} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColSubtractLongScalar.evaluate(DoubleColSubtractLongScalar.java:46) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:69) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColMultiplyDoubleColumn.evaluate(DoubleColMultiplyDoubleColumn.java:41) at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFSumDouble.aggregateInputSelection(VectorUDAFSumDouble.java:98) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processAggregators(VectorGroupByOperator.java:174) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:151) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:104) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:717) ... 9 more {noformat}
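The failure mode in the trace is a plain Java downcast going wrong: the generated expression assumes a double column but was wired to a long column. A simplified demonstration with stand-in classes (these mirror the names in the trace but are not Hive's real vectorization classes):

```java
// Simplified stand-ins for the classes named in the stack trace; not Hive's code.
public class VectorCastDemo {
    static class ColumnVector {}
    static class LongColumnVector extends ColumnVector { long[] vector = new long[4]; }
    static class DoubleColumnVector extends ColumnVector { double[] vector = new double[4]; }

    /** Mimics an expression compiled for double input: it downcasts unconditionally. */
    static double firstValue(ColumnVector input) {
        // If the plan wires a long column into this expression, the cast fails
        // at runtime with exactly the ClassCastException reported above.
        return ((DoubleColumnVector) input).vector[0];
    }

    public static void main(String[] args) {
        try {
            firstValue(new LongColumnVector()); // the mismatched wiring from the bug
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the reported stack trace");
        }
    }
}
```

Presumably the fix belongs in expression selection or plan-time type conversion rather than in the expression itself, since each generated expression is specialized to one input type.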
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676365#comment-13676365 ] Shreepadma Venugopalan commented on HIVE-4435: -- [~ashutoshc]: I've updated the .q files in the patches. Thanks! Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
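For background on the estimator fix: a standard pairwise-independent family is the Carter-Wegman construction h(x) = (a*x + b) mod p for a prime p with a and b drawn uniformly at random. The sketch below is illustrative; the constants are hypothetical and are not taken from Hive's patch.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of a Carter-Wegman pairwise-independent hash family; the constants
// are illustrative and are not taken from Hive's patch.
public class PairwiseHash {
    static final long P = 2147483647L; // Mersenne prime 2^31 - 1

    /** h_{a,b}(x) = (a*x + b) mod p; uniform random a, b give pairwise independence. */
    static long hash(long a, long b, long x) {
        return ((a % P) * (x % P) % P + b) % P;
    }

    /** Index of the lowest set bit: the statistic a Flajolet-Martin sketch records. */
    static int trailingZeros(long h) {
        return Long.numberOfTrailingZeros(h); // 64 for h == 0
    }

    public static void main(String[] args) {
        // A monotonically increasing "primary key" column: exactly the case where
        // a weak hash produces badly skewed trailing-zero statistics.
        Set<Integer> seen = new HashSet<>();
        for (long x = 1; x <= 1000; x++) {
            seen.add(trailingZeros(hash(1000003L, 17L, x)));
        }
        System.out.println("distinct trailing-zero values observed: " + seen.size());
    }
}
```

With a uniform hash, roughly half the hashed values end in zero trailing zero bits, a quarter in one, and so on, which is the distribution the FM estimate depends on; a hash that is not pairwise independent breaks this on sequential keys.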
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Status: Patch Available (was: Open) Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Attachment: HIVE-4435.2.patch Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch, HIVE-4435.2.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676367#comment-13676367 ] Shreepadma Venugopalan commented on HIVE-4641: -- Enforcing security on a per row basis could be one use of such a hook. The hook can be used in other ways to apply custom transformations to the result set before returning to the client. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results to enforce a specific security policy before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
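A hypothetical shape for such a hook (Hive's eventual interface may differ): the server hands each fetched batch to the hook and returns whatever the hook gives back, which is enough to express the row-level filtering described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical hook shape; HiveServer2's real interface may differ.
public class FetchHookDemo {

    /** Invoked on every fetched row batch before it is returned to the client. */
    interface PostFetchHook {
        List<String[]> postFetch(List<String[]> rows);
    }

    /** Example policy: drop any row whose first column is marked restricted. */
    static final PostFetchHook REDACT_RESTRICTED = rows -> {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            if (!"restricted".equals(row[0])) {
                out.add(row);
            }
        }
        return out;
    };

    public static void main(String[] args) {
        List<String[]> batch = Arrays.asList(
                new String[] {"public", "row-a"},
                new String[] {"restricted", "row-b"});
        System.out.println(REDACT_RESTRICTED.postFetch(batch).size()); // 1
    }
}
```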
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4657: --- Assignee: Shreepadma Venugopalan HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676416#comment-13676416 ] Ashutosh Chauhan commented on HIVE-4657: +1
[jira] [Created] (HIVE-4668) wrong results for query with modulo (%) in WHERE clause filter
Eric Hanson created HIVE-4668: - Summary: wrong results for query with modulo (%) in WHERE clause filter Key: HIVE-4668 URL: https://issues.apache.org/jira/browse/HIVE-4668 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson select disinternalmsft16431, count(disinternalmsft16431) from factsqlengineam_vec_orc where ddate >= 2012-12 and ddate < 2013-02 and disinternalmsft16431 % 5 = 0 group by disinternalmsft16431 Expected result: 0 3160232 5 33039254 Actual result: 0 8697033 6 2706407 5 94709959 There should be no result row for 6 because 6 % 5 != 0.
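As a sanity check on the semantics the report expects, the filter `x % 5 = 0` must admit only exact multiples of 5, so a group for 6 cannot appear. A trivial Java restatement of that predicate:

```java
// Restates the WHERE-clause predicate from HIVE-4668 in plain Java:
// a value may appear in the result only when x % 5 == 0, so the
// reported group for 6 (6 % 5 == 1) is incorrect output.
class ModuloFilter {
    static boolean passesMod5(int x) {
        return x % 5 == 0;
    }
}
```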
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676441#comment-13676441 ] Ashutosh Chauhan commented on HIVE-2206: For some of the patterns in your testcases (e.g., a Join followed by a GBY on the same keys), I assume the reducesink deduplication optimization will already take care of it such that it generates only 1 MR job. Is that correct? Or is it that the reducesink dedup optimization will not fire for any of your testcases? If it's the former, it would be good to identify which of those cases are already taken care of by RS dedup. If it's the latter, it would be good to know why the reducesink dedup optimization is not kicking in for those. add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.10-r1384442.patch.txt, HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, HIVE-2206.D11097.1.patch, testQueries.2.q, YSmartPatchForHive.patch This issue proposes a new logical optimizer called Correlation Optimizer, which is used to merge correlated MapReduce jobs (MR jobs) into a single MR job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The paper and slides of YSmart are linked at the bottom.
Since Hive translates queries in a sentence-by-sentence fashion, for every operation which may need to shuffle the data (e.g. join and aggregation operations), Hive will generate a MapReduce job for that operation. However, operations which need to shuffle the data may involve the correlations explained below and thus can be executed in a single MR job. # Input Correlation: Multiple MR jobs have input correlation (IC) if their input relation sets are not disjoint; # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they have not only input correlation, but also the same partition key; # Job Flow Correlation: An MR job has job flow correlation (JFC) with one of its child nodes if it has the same partition key as that child node. The current implementation of the correlation optimizer only detects correlations among MR jobs for reduce-side join operators and reduce-side aggregation operators (not map-only aggregation). A query will be optimized if it satisfies the following conditions. # There exists an MR job for a reduce-side join operator or reduce-side aggregation operator which has JFC with all of its parent MR jobs (TCs will also be exploited if JFC exists); # All input tables of those correlated MR jobs are original input tables (not intermediate tables generated by sub-queries); and # No self-join is involved in those correlated MR jobs. The correlation optimizer is implemented as a logical optimizer. The main reasons are that it only needs to manipulate the query plan tree and it can leverage the existing components for generating MR jobs. The current implementation can serve as a framework for correlation-related optimizations. I think that it is better than adding individual optimizers. There is more work that can be done in the future to improve this optimizer. Here are three examples.
# Support queries that only involve TC; # Support queries in which input tables of correlated MR jobs involve intermediate tables; and # Optimize queries involving self-joins. References: Paper and presentation of YSmart. Paper: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf Slides: http://sdrv.ms/UpwJJc
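The job-flow-correlation condition above reduces to a key comparison between a job and its child. A toy sketch of that check (the MrJob class is an illustrative stand-in, not a Hive class):

```java
import java.util.List;

// Toy model of the JFC test described above: an MR job has job flow
// correlation with its child when both shuffle on the same partition key,
// which is what lets the optimizer merge the two shuffles into one MR job.
class MrJob {
    final List<String> partitionKey;
    final MrJob child; // null for the last job in the flow

    MrJob(List<String> partitionKey, MrJob child) {
        this.partitionKey = partitionKey;
        this.child = child;
    }

    boolean hasJfcWithChild() {
        return child != null && partitionKey.equals(child.partitionKey);
    }
}
```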
[jira] [Assigned] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam reassigned HIVE-3159: --- Assignee: Mohammad Kamrul Islam Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema.
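The requested inference boils down to a type-by-type mapping from Hive column types to Avro schema types. The table below is a rough sketch covering a few primitives only; the class name is illustrative, and the real conversion also has to handle complex types (structs, maps, arrays, unions) and nullability.

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the inference HIVE-3159 asks for: derive an Avro type
// from a Hive column type instead of requiring a hand-written schema.
class HiveToAvroTypeSketch {
    private static final Map<String, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("int", "int");
        PRIMITIVES.put("bigint", "long");
        PRIMITIVES.put("double", "double");
        PRIMITIVES.put("string", "string");
        PRIMITIVES.put("boolean", "boolean");
    }

    static String avroTypeFor(String hiveType) {
        String avro = PRIMITIVES.get(hiveType);
        if (avro == null) {
            throw new IllegalArgumentException("unmapped Hive type: " + hiveType);
        }
        return avro;
    }
}
```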
[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676452#comment-13676452 ] Eric Hanson commented on HIVE-4665: --- Similar error occurs for this query: select avg(disinternalmsft16431) from factsqlengineam_vec_orc; Error: Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableDoubleObjectInspector.get(WritableDoubleObjectInspector.java:35) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:340) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:534) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257) at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204) at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:245) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502) at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196) ... 8 more error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total 
MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException:
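The ClassCastException in the stack above involves two unrelated classes that share the simple name DoubleWritable (org.apache.hadoop.io.DoubleWritable vs. org.apache.hadoop.hive.serde2.io.DoubleWritable). The toy classes below are stand-ins, not the Hadoop/Hive ones, and only reproduce the shape of that failure:

```java
// Two distinct classes with the same simple name, mimicking
// org.apache.hadoop.io.DoubleWritable and
// org.apache.hadoop.hive.serde2.io.DoubleWritable.
class HadoopIo { static class DoubleWritable { double value; } }
class HiveSerde2Io { static class DoubleWritable { double value; } }

class WritableCastDemo {
    // Returns true because casting one DoubleWritable to the other throws:
    // same simple name, different class, hence the CCE in the serde path.
    static boolean castFails() {
        Object produced = new HadoopIo.DoubleWritable();
        try {
            HiveSerde2Io.DoubleWritable expected = (HiveSerde2Io.DoubleWritable) produced;
            return expected == null; // unreachable; the cast above throws
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

This is why the fix must make the vectorized operators emit the serde2.io writable the downstream object inspectors expect.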
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: (was: Jitendra Nath Pandey) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:235)
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4554: -- Attachment: HIVE-4554.patch.5 Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the data in the file to the table. The following error can be seen in the console: hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing.
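The space in the path is the trigger: a raw space is illegal in a URI, while File.toURI() percent-encodes it. The snippet below only demonstrates that encoding gap; it is not the HIVE-4554 patch itself.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Shows why "/home/xzhang/temp/data set.txt" trips up URI-based path
// handling: the raw string is not a valid URI, but File.toURI() escapes
// the space as %20.
class SpacePathDemo {
    static boolean rawUriRejectsSpace(String path) {
        try {
            new URI(path);
            return false;
        } catch (URISyntaxException e) {
            return true; // a literal space is illegal in a URI
        }
    }

    static String encoded(String path) {
        return new File(path).toURI().getRawPath(); // space becomes %20
    }
}
```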
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676462#comment-13676462 ] Yin Huai commented on HIVE-2206: RS dedup is on by default. So the explain without CorrelationOptimizer should be optimized by RS dedup. But it seems that it does not fire in any of my cases. Will take a look at it later.
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows because of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676471#comment-13676471 ] Ashutosh Chauhan commented on HIVE-4348: [~shuainie] I assume you have also tested this on Linux. Is that correct? Unit test compile fail at hbase-handler project on Windows because of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is that the automatically generated test cases hardcode the file path string of the query file using \ instead of \\ as the escape character. The change should be in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm
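The failure mode is easy to reproduce: in Java source a lone backslash starts an escape sequence, so a generated literal like "C:\Users\test.q" either fails to compile (\U is an illegal escape) or silently corrupts the path (\t becomes a tab). Doubling the backslash, which is what the template fix amounts to, keeps the literal compilable. The path below is a made-up example:

```java
// Demonstrates the escape problem behind HIVE-4348.
class EscapePathDemo {
    // String bad = "C:\Users\test.q";  // javac error: illegal escape character (\U)
    // String sly = "C:\temp\test.q";   // compiles, but \t silently becomes a tab!
    static final String GOOD = "C:\\Users\\test.q"; // doubled: compiles, value has single '\'

    static boolean valueHasSingleBackslashes() {
        // The runtime value contains single '\' characters, no doubled ones.
        return GOOD.indexOf('\\') == 2 && !GOOD.contains("\\\\");
    }
}
```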
[jira] [Commented] (HIVE-4526) auto_sortmerge_join_9.q throws NPE but test is succeeded
[ https://issues.apache.org/jira/browse/HIVE-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676478#comment-13676478 ] Hudson commented on HIVE-4526: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) (Revision 1489703) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1489703 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java * /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out auto_sortmerge_join_9.q throws NPE but test is succeeded Key: HIVE-4526 URL: https://issues.apache.org/jira/browse/HIVE-4526 Project: Hive Issue Type: Test Components: Tests Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4526.D10725.1.patch auto_sortmerge_join_9.q {noformat} [junit] Running org.apache.hadoop.hive.cli.TestCliDriver [junit] Begin query: auto_sortmerge_join_9.q [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl1 [junit] Deleted file:/home/navis/apache/oss-hive/build/ql/test/data/warehouse/tbl2 [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromPartitionedTable(FetchOperator.java:252) [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:605) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) [junit] at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at org.apache.hadoop.util.RunJar.main(RunJar.java:156) [junit] [junit] at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631) [junit] at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:393) [junit] at
[jira] [Commented] (HIVE-2304) Support PreparedStatement.setObject
[ https://issues.apache.org/jira/browse/HIVE-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676474#comment-13676474 ] Hudson commented on HIVE-2304: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) (Revision 1489704) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489704 Files : * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HivePreparedStatement.java * /hive/trunk/jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java Support PreparedStatement.setObject --- Key: HIVE-2304 URL: https://issues.apache.org/jira/browse/HIVE-2304 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: 0.7.1 Reporter: Ido Hadanny Assignee: Ido Hadanny Priority: Minor Fix For: 0.12.0 Attachments: HIVE-0.8-SetObject.2.patch.txt Original Estimate: 1h Remaining Estimate: 1h PreparedStatement.setObject is important for Spring's JdbcTemplate support
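setObject implementations typically inspect the runtime type of the argument and delegate to the matching typed setter. The dispatch sketch below is illustrative of that pattern only; the names are not taken from the actual HivePreparedStatement patch.

```java
// Maps an argument's runtime type to the typed JDBC setter a setObject
// implementation would delegate to. An unsupported type is rejected,
// mirroring how a driver raises SQLException in that case (an unchecked
// exception is used here to keep the sketch self-contained).
class SetObjectDispatch {
    static String setterFor(Object value) {
        if (value == null) return "setNull";
        if (value instanceof String) return "setString";
        if (value instanceof Integer) return "setInt";
        if (value instanceof Long) return "setLong";
        if (value instanceof Double) return "setDouble";
        if (value instanceof Boolean) return "setBoolean";
        throw new IllegalArgumentException("no typed setter for " + value.getClass());
    }
}
```

This is exactly the shape JdbcTemplate relies on: it passes `Object[]` query parameters and expects the driver to route each one correctly.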
[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676475#comment-13676475 ] Hudson commented on HIVE-4646: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4646 : skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) (Revision 1489441) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489441 Files : * /hive/trunk/hcatalog/build.xml * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java skewjoin.q is failing in hadoop2 Key: HIVE-4646 URL: https://issues.apache.org/jira/browse/HIVE-4646 Project: Hive Issue Type: Test Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: HIVE-4646.D11043.1.patch https://issues.apache.org/jira/browse/HDFS-538 changed to throw an exception instead of returning null for a non-existing path. But the skew resolver depends on the old behavior.
[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676472#comment-13676472 ] Hudson commented on HIVE-4546: -- Integrated in Hive-trunk-hadoop2 #226 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/226/]) HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) (Revision 1489431) Result = ABORTED hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489431 Files : * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java Hive CLI leaves behind the per session resource directory on non-interactive invocation --- Key: HIVE-4546 URL: https://issues.apache.org/jira/browse/HIVE-4546 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The CLI fails to remove it when invoked using -f or -e (non-interactive mode)
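One robust way to guarantee cleanup on both the interactive and -e/-f code paths is a shared recursive-delete routine invoked when the session closes. This is a generic sketch of that idea, not the actual HIVE-4546 patch:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Recursively deletes a per-session resource directory, so the same
// cleanup can run regardless of how the CLI was invoked.
class SessionDirCleanup {
    static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        f.delete();
    }

    // Creates a throwaway "resource dir" with one file, cleans it up,
    // and reports whether the directory is really gone.
    static boolean demo() {
        try {
            File dir = Files.createTempDirectory("hive_session_resources").toFile();
            new File(dir, "added.jar").createNewFile();
            deleteRecursively(dir);
            return !dir.exists();
        } catch (IOException e) {
            return false;
        }
    }
}
```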
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676521#comment-13676521 ] Shreepadma Venugopalan commented on HIVE-4641: -- This is a general purpose hook and is not specific to any feature. Hive has hooks at various stages of compilation and execution - pre semantic analysis, post semantic analysis, pre execution etc, but misses a post execution/post fetch hook. This JIRA just adds that. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
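To make the hook proposal concrete, here is one possible shape for a post-execution/fetch hook: invoked after the query runs but before rows are returned, so an implementation can filter or audit the result set. All names are hypothetical; HIVE-4641 does not specify this interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hook: receives the requesting user and the fetched rows,
// and returns the (possibly filtered) rows to hand back to the client.
interface PostFetchHook {
    List<String> postFetch(String user, List<String> rows);
}

public class HookDemo {
    // Apply each registered hook in order, threading the row set through.
    static List<String> applyHooks(List<PostFetchHook> hooks, String user, List<String> rows) {
        for (PostFetchHook h : hooks) {
            rows = h.postFetch(user, rows);
        }
        return rows;
    }

    public static void main(String[] args) {
        PostFetchHook redact = (user, rows) -> {
            List<String> out = new ArrayList<>();
            for (String r : rows) {
                if (!r.contains("secret")) out.add(r);
            }
            return out;
        };
        System.out.println(applyHooks(List.of(redact), "alice",
                List.of("row1", "secret-row", "row2")));  // prints [row1, row2]
    }
}
```

Note the caveat raised later in this thread: any such filtering runs on the server node in a non-distributed fashion, so mutating large result sets here has a real cost.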
[jira] [Commented] (HIVE-4654) Remove unused org.apache.hadoop.hive.ql.exec Writables
[ https://issues.apache.org/jira/browse/HIVE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676529#comment-13676529 ] Eric Hanson commented on HIVE-4654: --- This appears related to some functional bugs. See https://issues.apache.org/jira/browse/HIVE-4665. Remove unused org.apache.hadoop.hive.ql.exec Writables -- Key: HIVE-4654 URL: https://issues.apache.org/jira/browse/HIVE-4654 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Priority: Minor The Writables are originally from org.apache.hadoop.io. I tend to assume that they have been re-defined in hive if the original implementation was not considered good enough. However, I don't understand why some are defined twice in hive itself. I noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. The ByteWritable in serde2.io is being referred to in bunch of places. Therefore, I would suggest to just use the one in serde2.io. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: Eric Hanson error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(Writable StringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at
[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676553#comment-13676553 ] Eric Hanson commented on HIVE-4665: --- I started working on this and was able to get select avg(disinternalmsft16431) from factsqlengineam_vec_orc; to run by importing DoubleWritable like so in VectorUDAFAvg.txt: import org.apache.hadoop.hive.serde2.io.DoubleWritable; instead of from org.apach.hadoop.io.DoubleWritable error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks 
not specified. Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at
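The ClassCastException above and the import fix in the comment come down to two unrelated classes sharing a simple name: org.apache.hadoop.io.DoubleWritable versus org.apache.hadoop.hive.serde2.io.DoubleWritable. Which one an import resolves to decides what the ObjectInspector later sees. A self-contained sketch with stand-in classes (not the real Writables):

```java
public class WritableMismatch {
    // Stand-ins for the two same-named, unrelated Writable classes.
    static class HadoopDoubleWritable { double v; HadoopDoubleWritable(double v) { this.v = v; } }
    static class HiveDoubleWritable   { double v; HiveDoubleWritable(double v)   { this.v = v; } }

    // An inspector written against the Hive type fails on the Hadoop one.
    static double inspect(Object o) {
        return ((HiveDoubleWritable) o).v;  // ClassCastException if the wrong import produced the object
    }

    public static void main(String[] args) {
        System.out.println(inspect(new HiveDoubleWritable(1.5)));
        try {
            inspect(new HadoopDoubleWritable(1.5));
        } catch (ClassCastException e) {
            System.out.println("cast failed, analogous to the HIVE-4665 stack trace");
        }
    }
}
```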
[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
[ https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4665: -- Assignee: Jitendra Nath Pandey (was: Eric Hanson) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized -- Key: HIVE-4665 URL: https://issues.apache.org/jira/browse/HIVE-4665 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Jitendra Nath Pandey CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from FactSqlEngineAM4712; hive select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc group by ddate; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. 
Estimated from input data size: 3 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Validating if vectorized execution is applicable Going down the vectorization path java.lang.InstantiationException: org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator Continuing ... java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator); Continuing ... Starting Job = job_201306041757_0016, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306041757_0016 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 3 2013-06-05 10:03:06,022 Stage-1 map = 0%, reduce = 0% 2013-06-05 10:03:51,142 Stage-1 map = 100%, reduce = 100% Ended Job = job_201306041757_0016 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016 Examining task ID: task_201306041757_0016_m_09 (and more) from job job_201306041757_0016 Task with the most failures(4): - Task ID: task_201306041757_0016_m_00 URL: http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016tipid=task_201306041757_0016_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:271) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135) at org.apache.hadoop.mapred.Child.main(Child.java:265) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(Writable StringObjectInspector.java:40) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481) at
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676561#comment-13676561 ] Ashutosh Chauhan commented on HIVE-4561: Now, {{columnstats_tbllvl.q}} failed with following exception: {code} [junit] java.lang.NullPointerException [junit] at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyLongObjectInspector.get(LazyLongObjectInspector.java:38) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackLongStats(ColumnStatsTask.java:126) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackPrimitiveObject(ColumnStatsTask.java:196) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.unpackStructObject(ColumnStatsTask.java:224) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.constructColumnStatsFromPackedRow(ColumnStatsTask.java:263) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:327) [junit] at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343) [junit] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145) [junit] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) [junit] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355) [junit] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139) [junit] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) [junit] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) [junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) [junit] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:790) [junit] at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:6279) [junit] at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl(TestCliDriver.java:1971) {code} Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch if all column values larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; or if all column values less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000 hive (default)> create table src_test (price double); hive (default)> load data local inpath './test.txt' into table src_test; hive (default)> select * from src_test; OK 1.0 2.0 3.0 Time taken: 0.313 seconds, Fetched: 3 row(s) hive (default)> analyze table src_test compute statistics for columns price; mysql> select * from TAB_COL_STATS \G; CS_ID: 16 DB_NAME: default TABLE_NAME: src_test COLUMN_NAME: price COLUMN_TYPE: double TBL_ID: 2586 LONG_LOW_VALUE: 0 LONG_HIGH_VALUE: 0 DOUBLE_LOW_VALUE: 0.0000 # Wrong Result ! Expected is 1.0000 DOUBLE_HIGH_VALUE: 3.0000 BIG_DECIMAL_LOW_VALUE: NULL BIG_DECIMAL_HIGH_VALUE: NULL NUM_NULLS: 0 NUM_DISTINCTS: 1 AVG_COL_LEN: 0.0000 MAX_COL_LEN: 0 NUM_TRUES: 0 NUM_FALSES: 0 LAST_ANALYZED: 1368596151 2 rows in set (0.00 sec) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4459: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Jarek! Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
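The HIVE-4459 fix amounts to a default-only assignment in the hcat shell script: use the caller's HIVE_CONF_DIR when it is already set, and fall back to `${HIVE_HOME}/conf` only when it is not. The same resolution logic, expressed in Java for illustration (the method name is ours, not Hive's):

```java
public class ConfDirResolution {
    // Default-only assignment: an explicitly provided value wins;
    // the $HIVE_HOME/conf fallback applies only when nothing was set.
    static String resolveConfDir(String explicit, String hiveHome) {
        return (explicit != null && !explicit.isEmpty())
                ? explicit
                : hiveHome + "/conf";
    }

    public static void main(String[] args) {
        System.out.println(resolveConfDir(null, "/opt/hive"));        // /opt/hive/conf
        System.out.println(resolveConfDir("/etc/hive", "/opt/hive")); // /etc/hive
    }
}
```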
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4554: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xuefu! Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the date in the file to the table. The following error can be seen in the console: hive LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
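The failure mode in HIVE-4554 is the classic one for paths with spaces: a raw space is illegal in a URI, so a filesystem path like /home/xzhang/temp/data set.txt breaks naive path-to-URI handling. A sketch of the percent-encoding step that keeps such a path usable (illustrative of the failure mode, not Hive's exact patch):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathWithSpaces {
    // The multi-argument URI constructor percent-encodes characters that are
    // illegal in a URI path component, including spaces.
    static String encodePath(String rawPath) throws URISyntaxException {
        return new URI(null, null, rawPath, null).getRawPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(encodePath("/home/xzhang/temp/data set.txt"));
        // -> /home/xzhang/temp/data%20set.txt
    }
}
```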
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676569#comment-13676569 ] Ashutosh Chauhan commented on HIVE-4348: I tested it on linux. Works great. +1 Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676571#comment-13676571 ] Zhuoluo (Clark) Yang commented on HIVE-4561: [~ashutoshc] I think it happens when we try to persist a null max/min. I think the simplest way is to leave it empty in the ColumnStatsTask. I will try to make a new patch and run a full test. Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch if all column values larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0000; or if all column values less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0000 hive (default)> create table src_test (price double); hive (default)> load data local inpath './test.txt' into table src_test; hive (default)> select * from src_test; OK 1.0 2.0 3.0 Time taken: 0.313 seconds, Fetched: 3 row(s) hive (default)> analyze table src_test compute statistics for columns price; mysql> select * from TAB_COL_STATS \G; CS_ID: 16 DB_NAME: default TABLE_NAME: src_test COLUMN_NAME: price COLUMN_TYPE: double TBL_ID: 2586 LONG_LOW_VALUE: 0 LONG_HIGH_VALUE: 0 DOUBLE_LOW_VALUE: 0.0000 # Wrong Result ! Expected is 1.0000 DOUBLE_HIGH_VALUE: 3.0000 BIG_DECIMAL_LOW_VALUE: NULL BIG_DECIMAL_HIGH_VALUE: NULL NUM_NULLS: 0 NUM_DISTINCTS: 1 AVG_COL_LEN: 0.0000 MAX_COL_LEN: 0 NUM_TRUES: 0 NUM_FALSES: 0 LAST_ANALYZED: 1368596151 2 rows in set (0.00 sec) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
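The suggestion above, leaving the stat empty rather than persisting a bogus value, can be sketched as a null guard before unboxing. This is a stand-in for the unpack logic in ColumnStatsTask (the NPE in the earlier comment comes from dereferencing a null stat), with hypothetical method names:

```java
import java.util.Locale;

public class ColumnStatsUnpack {
    // Stand-in for unpacking a min/max stat from the packed row;
    // it may legitimately be null when there was nothing to aggregate.
    static Double unpackDoubleStat(Double packed) {
        return packed;
    }

    static String persistLowValue(Double packed) {
        Double low = unpackDoubleStat(packed);
        if (low == null) {
            return "(not set)";  // leave the column empty instead of writing 0.0000
        }
        return String.format(Locale.ROOT, "%.4f", low);
    }

    public static void main(String[] args) {
        System.out.println(persistLowValue(1.0));   // 1.0000
        System.out.println(persistLowValue(null));  // (not set)
    }
}
```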
[jira] [Commented] (HIVE-4382) Fix offline build mode
[ https://issues.apache.org/jira/browse/HIVE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676572#comment-13676572 ] Brock Noland commented on HIVE-4382: FWIW, it doesn't look like this patch applies (due to other changes obviously). Fix offline build mode -- Key: HIVE-4382 URL: https://issues.apache.org/jira/browse/HIVE-4382 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Attachments: HIVE-4382.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4348: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shuaishuai! Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
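The HIVE-4348 bug class is easy to reproduce: when a test generator embeds a Windows path into Java source, each backslash must itself be escaped, otherwise the generated file contains illegal escape sequences like \q and fails to compile. A sketch of the escaping step (the helper name is ours, not from the .vm templates):

```java
public class EscapeWindowsPath {
    // Double every backslash so the path survives as a valid Java string
    // literal in generated source code.
    static String asJavaStringLiteral(String path) {
        return "\"" + path.replace("\\", "\\\\") + "\"";
    }

    public static void main(String[] args) {
        String raw = "C:\\hive\\hbase-handler\\queries\\test.q";
        System.out.println(asJavaStringLiteral(raw));
        // -> "C:\\hive\\hbase-handler\\queries\\test.q"
    }
}
```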
[jira] [Updated] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4657: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shreepadma! HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Fix For: 0.12.0 Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. 
{noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4641) Support post execution/fetch hook for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676588#comment-13676588 ] Carl Steinbach commented on HIVE-4641: -- Ok, but can you give me an actual example of something that requires this functionality? What kind of information do you plan to make available to the hook? I also think we need to be really careful about providing a way for people to mutate the resultset after it has been generated since this work will be done on the server node in a non-distributed fashion. Support post execution/fetch hook for HiveServer2 - Key: HIVE-4641 URL: https://issues.apache.org/jira/browse/HIVE-4641 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Support post execution/fetch hook that is invoked prior to returning results to the client. This can be used to filter results before returning the result set to the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4422) Test output need to be updated for Windows only unit test in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676593#comment-13676593 ] Ashutosh Chauhan commented on HIVE-4422: +1 Test output need to be updated for Windows only unit test in TestCliDriver -- Key: HIVE-4422 URL: https://issues.apache.org/jira/browse/HIVE-4422 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4422.1.patch Update the Windows only unit test expected output for combine2_win.q input_part10_win.q and load_dyn_part14_win.q -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4669) Make username available to semantic analyzer hooks
Shreepadma Venugopalan created HIVE-4669: Summary: Make username available to semantic analyzer hooks Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4669: - Status: Patch Available (was: In Progress) Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4669: - Attachment: HIVE-4669.1.patch Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HIVE-4669) Make username available to semantic analyzer hooks
[ https://issues.apache.org/jira/browse/HIVE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4669 started by Shreepadma Venugopalan. Make username available to semantic analyzer hooks -- Key: HIVE-4669 URL: https://issues.apache.org/jira/browse/HIVE-4669 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4669.1.patch Make username available to the semantic analyzer hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676598#comment-13676598 ] Shreepadma Venugopalan commented on HIVE-4561: -- [~clarkyzl]: I'm not sure I understand the fix here. Can you please elaborate on what it means to leave it empty in the ColumnStatsTask? Thanks! Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000   # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
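The failure mode in the report, a DOUBLE_LOW_VALUE stuck at 0.0000 even though every value in the column is positive, is what you get when a running minimum starts from a zero default instead of from the data. The sketch below is illustrative only, assuming that cause; it is plain Java, not Hive's ColumnStatsTask code, and the method names are hypothetical.

```java
// Illustrative only: why initializing the running minimum to 0.0 yields
// a low value of 0.0 for all-positive data, and one way to avoid it.
// Not Hive's ColumnStatsTask; names here are hypothetical.
public class LowValueDemo {
    // Buggy variant: starts from 0.0, so any all-positive column reports 0.0.
    public static double buggyMin(double[] values) {
        double min = 0.0;
        for (double v : values) if (v < min) min = v;
        return min;
    }

    // Fixed variant: starts from the first value; null when there are no rows.
    public static Double fixedMin(double[] values) {
        if (values.length == 0) return null;
        double min = values[0];
        for (double v : values) if (v < min) min = v;
        return min;
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0};
        System.out.println(buggyMin(prices)); // 0.0 (the reported symptom)
        System.out.println(fixedMin(prices)); // 1.0 (the expected low value)
    }
}
```

The null return for an empty input is the same idea as the proposed "leave it empty" fix: an absent statistic is distinguishable from a legitimate 0.0.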
[jira] [Commented] (HIVE-4390) Enable capturing input URI entities for DML statements
[ https://issues.apache.org/jira/browse/HIVE-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676600#comment-13676600 ] Prasad Mujumdar commented on HIVE-4390: --- [~ashutoshc] Thanks for taking a look. The patch adds a new type of object passed to the hooks. This could cause problems for existing hooks that aren't expecting the new type. We can keep this enabled by default, but a config to turn it off would be useful. Enable capturing input URI entities for DML statements -- Key: HIVE-4390 URL: https://issues.apache.org/jira/browse/HIVE-4390 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-4390-2.patch The query compiler doesn't capture the files or directories accessed by the following statements: * Load data * Export * Import * Alter table/partition set location This is very useful information to access from the hooks for monitoring/auditing etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: HIVE-4669. Make username available to semantic analyzer hooks
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11661/ --- Review request for hive, Ashutosh Chauhan and Navis Ryu. Description --- Makes user name available to the semantic analyzer hooks. This addresses bug HIVE-4669. https://issues.apache.org/jira/browse/HIVE-4669 Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java a5a867a ql/src/java/org/apache/hadoop/hive/ql/parse/HiveSemanticAnalyzerHookContext.java ae371f3 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveSemanticAnalyzerHookContextImpl.java 9c3377e Diff: https://reviews.apache.org/r/11661/diff/ Testing --- Thanks, Shreepadma Venugopalan
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676603#comment-13676603 ] Shuaishuai Nie commented on HIVE-4348: -- Thanks [~ashutoshc] Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is that the automatically generated test cases hardcode the file path string of the query file using \ instead of \\ as the escape character. The change should be in TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4670) Authentication module should pass the instance part of the Kerberos principle
Shreepadma Venugopalan created HIVE-4670: Summary: Authentication module should pass the instance part of the Kerberos principle Key: HIVE-4670 URL: https://issues.apache.org/jira/browse/HIVE-4670 Project: Hive Issue Type: Bug Components: Authentication, HiveServer2 Affects Versions: 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan When Kerberos authentication is enabled for HiveServer2, the thrift SASL layer passes instance@realm from the principal. It should instead strip the realm and pass just the instance part of the principal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
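The fix described above amounts to dropping the @REALM suffix from a Kerberos principal of the form primary/instance@REALM before handing it on. A minimal sketch of that string handling, in plain Java for illustration (this is not the HiveServer2/thrift SASL code, and the class and method names are hypothetical):

```java
// Hypothetical sketch: strip the realm from a Kerberos principal of the
// form primary/instance@REALM, keeping only the part before the '@'.
// Plain string handling for illustration, not HiveServer2 code.
public class PrincipalParts {
    public static String stripRealm(String principal) {
        int at = principal.indexOf('@');
        return at < 0 ? principal : principal.substring(0, at);
    }

    public static void main(String[] args) {
        // prints hive/host1.example.com
        System.out.println(stripRealm("hive/host1.example.com@EXAMPLE.COM"));
    }
}
```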
[jira] [Assigned] (HIVE-4668) wrong results for query with modulo (%) in WHERE clause filter
[ https://issues.apache.org/jira/browse/HIVE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga reassigned HIVE-4668: Assignee: Sarvesh Sakalanaga wrong results for query with modulo (%) in WHERE clause filter -- Key: HIVE-4668 URL: https://issues.apache.org/jira/browse/HIVE-4668 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Sarvesh Sakalanaga
select disinternalmsft16431, count(disinternalmsft16431) from factsqlengineam_vec_orc where ddate >= 2012-12 and ddate < 2013-02 and disinternalmsft16431 % 5 = 0 group by disinternalmsft16431
Expected result:
0 3160232
5 33039254
Actual result:
0 8697033
6 2706407
5 94709959
There should be no result row for 6 because 6 % 5 != 0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
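The expected semantics of the failing filter are straightforward: a group key may appear in the output only when key % 5 == 0, so 6 must be excluded. A minimal reference check in plain Java (illustrative only, not the vectorized Hive filter code):

```java
// Reference semantics for the WHERE clause predicate `value % 5 = 0`:
// only exact multiples of the divisor may survive the filter.
import java.util.Arrays;
import java.util.stream.LongStream;

public class ModuloFilter {
    public static long[] keepMultiplesOf(long divisor, long[] values) {
        return LongStream.of(values).filter(v -> v % divisor == 0).toArray();
    }

    public static void main(String[] args) {
        long[] kept = keepMultiplesOf(5, new long[] {0, 5, 6, 10, 11});
        // 6 and 11 are filtered out
        System.out.println(Arrays.toString(kept)); // [0, 5, 10]
    }
}
```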
[jira] [Assigned] (HIVE-4666) Count(*) over tpch lineitem ORC results in Error: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga reassigned HIVE-4666: Assignee: Sarvesh Sakalanaga Count(*) over tpch lineitem ORC results in Error: Java heap space - Key: HIVE-4666 URL: https://issues.apache.org/jira/browse/HIVE-4666 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: vectorization-branch Attachments: output Executing the following query over an ORC tpch lineitem table fails due to Error: Java heap space. {noformat} INSERT OVERWRITE LOCAL DIRECTORY 'd:\\count_output' SELECT Count(*) AS count_order FROM lineitem_orc {noformat} The lineitem table is approximately 1 GB in size. This error happens in both non-vectorized and vectorized modes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4422) Test output need to be updated for Windows only unit test in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4422: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Shuaishuai! Test output need to be updated for Windows only unit test in TestCliDriver -- Key: HIVE-4422 URL: https://issues.apache.org/jira/browse/HIVE-4422 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Environment: Windows Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4422.1.patch Update the Windows only unit test expected output for combine2_win.q input_part10_win.q and load_dyn_part14_win.q -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676611#comment-13676611 ] Zhuoluo (Clark) Yang commented on HIVE-4561: -- [~shreepadma] I think I am wrong. Originally, I wanted to return like this:
{code}
@@ -189,6 +187,11 @@
         statsObj.setStatsData(statsData);
       }
     } else {
+      // Any null object, such as min/max value of an empty table,
+      // need not be unpacked.
+      if (o == null) {
+        return;
+      }
       // invoke the right unpack method depending on data type of the column
       if (statsObj.getStatsData().isSetBooleanStats()) {
         unpackBooleanStats(oi, o, fieldName, statsObj);
{code}
However, I've found that LongColumnStatsData.highValue is required by thrift; modifications to ObjectStore would also be required to check LongColumnStatsData.isSetHighValue(). Any suggestions? Thanks! Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch, HIVE-4561.3.patch, HIVE-4561.4.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000   # Wrong result! Expected is 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3910) Create a new DATE datatype
[ https://issues.apache.org/jira/browse/HIVE-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676626#comment-13676626 ] Jason Dere commented on HIVE-3910: -- HIVE-4055 already has a patch with an initial implementation of a DATE type, which has already done quite a bit of the work for DATE support. I took a look at this and have a few proposed additions:
1. Use Joda Time rather than java.sql.Date
The existing patch uses java.sql.Date as the underlying data type (based on java.util.Date). Thejas proposed using the Joda Time library, as it is supposed to be a better datetime implementation and is also used by Pig for datetime handling. Joda Time does not appear to be currently used by Hive, so it would need to be pulled in as a dependent library.
2. Storage of DATE values
In the existing patch, DateWritable writes out a long value (8 bytes) representing seconds since the Unix epoch. As mentioned in HIVE-3910, since DATE is in days, we could reduce the storage space by instead storing a 4-byte integer value representing days since some epoch (1970? 4713 BC?). The range of dates that we can represent with such an integer representation would be +/- 2 billion days, or 5.8M years.
3. Considerations for Hive vectorization support
Talking to some folks who are working on Hive vectorization (HIVE-4160): in the interest of vectorization support, they want the date type to be represented as primitive values. They propose that DateWritable hold the integer value (rather than a Date value), which is still usable for comparison operations, the most common operations on date types (group-by, sorting). If an actual Date value is required, then DateWritable.get() will generate a Date object based on the days-since-epoch integer value.
4. SQL syntax compliance
The existing patch creates date values using a DATE() UDF - DATE('2013-01-01'). The SQL standard actually has syntax to represent a date literal - DATE '2013-01-01'. The Hive grammar would need to be extended to support the SQL syntax.
5. Operations on DATE types
The SQL standard (section 6.14) looks like it just supports DATE operations involving the INTERVAL type:
  <datetime value expression> ::=
      <datetime term>
    | <interval value expression> <plus sign> <datetime term>
    | <datetime value expression> <plus sign> <interval term>
    | <datetime value expression> <minus sign> <interval term>
There is currently no interval type support in Hive. Support for the interval type will be added as a later item.
6. Compatibility with other types
The existing patch allows a lot of implicit conversion to/from other types (numeric, string). TIMESTAMP has set a bit of a precedent in allowing a lot of implicit type conversion. However, given the limited operations with other types described above from the SQL standard, I would propose limiting the amount of implicit conversion that is allowed. There are UDFs that the user can use to convert DATE into numeric/string values, which can then be used in arithmetic or aggregation functions.
Create a new DATE datatype -- Key: HIVE-3910 URL: https://issues.apache.org/jira/browse/HIVE-3910 Project: Hive Issue Type: Task Reporter: Namit Jain It might be useful to have a DATE datatype along with timestamp. This would only store the day (possibly the number of days from 1970-01-01), and would thus give space savings in binary format. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
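The days-since-epoch storage proposed in the comment above (a 4-byte int counting days from 1970-01-01) can be sketched with java.time; this is an illustration of the representation, not the patch's DateWritable implementation, and the class name is hypothetical:

```java
// Sketch of the proposed DATE representation: a 4-byte int holding
// days since 1970-01-01. The ints sort in the same order as the dates,
// which is what makes comparison, group-by, and sorting cheap.
import java.time.LocalDate;

public class DateAsDays {
    public static int toEpochDays(LocalDate d) {
        return (int) d.toEpochDay();
    }

    public static LocalDate fromEpochDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    public static void main(String[] args) {
        int d = toEpochDays(LocalDate.of(2013, 1, 1));
        System.out.println(d);                // 15706
        System.out.println(fromEpochDays(d)); // 2013-01-01
    }
}
```

Because integer order matches date order, comparisons can be done directly on the stored value, and a full date object only needs to be materialized on demand, mirroring the DateWritable.get() proposal in item 3.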
[jira] [Commented] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676627#comment-13676627 ] Jason Dere commented on HIVE-4055: -- Hi Sun Rui, I made a few comments on possible additions to your proposed patch at HIVE-3910. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Attachments: HIVE-4055.1.patch.txt Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4459) Script hcat is overriding HIVE_CONF_DIR variable
[ https://issues.apache.org/jira/browse/HIVE-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676636#comment-13676636 ] Hudson commented on HIVE-4459: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4459 : Script hcat is overriding HIVE_CONF_DIR variable (Jarek Jarcec Cecho via Ashutosh Chauhan) (Revision 1490100) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490100 Files : * /hive/trunk/hcatalog/bin/hcat Script hcat is overriding HIVE_CONF_DIR variable Key: HIVE-4459 URL: https://issues.apache.org/jira/browse/HIVE-4459 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Jarek Jarcec Cecho Assignee: Jarek Jarcec Cecho Priority: Minor Fix For: 0.12.0 Attachments: bugHIVE-4459.patch Script {{hcat}} is currently overriding variable {{HIVE_CONF_DIR}} to {{$\{HIVE_HOME}/conf}}. It would be useful to use the previous content of the variable if it was set by the user. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676632#comment-13676632 ] Hudson commented on HIVE-2670: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) (Revision 1490106) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490106 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Fix For: 0.12.0 Attachments: harness.tar, HIVE-2670_5.patch, HIVE-2670_6.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4348) Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character
[ https://issues.apache.org/jira/browse/HIVE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676634#comment-13676634 ] Hudson commented on HIVE-4348: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4348 : Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character (Shuaishuai Nie via Ashutosh Chauhan) (Revision 1490103) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490103 Files : * /hive/trunk/hbase-handler/src/test/templates/TestHBaseCliDriver.vm * /hive/trunk/hbase-handler/src/test/templates/TestHBaseNegativeCliDriver.vm Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character -- Key: HIVE-4348 URL: https://issues.apache.org/jira/browse/HIVE-4348 Project: Hive Issue Type: Bug Components: HBase Handler, Testing Infrastructure, Windows Affects Versions: 0.11.0 Environment: Windows 8 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Fix For: 0.12.0 Attachments: HIVE-4348.patch Original Estimate: 24h Remaining Estimate: 24h The problem is because the automatically generated test case hardcoded file path string of query file using \ instead of \\ as escape character. The change should be in the TestHBaseCliDriver.vm and TestHBaseNegativeCliDriver.vm -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4657) HCatalog checkstyle violation after HIVE-2670
[ https://issues.apache.org/jira/browse/HIVE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676633#comment-13676633 ] Hudson commented on HIVE-4657: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) (Revision 1490106) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490106 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf * /hive/trunk/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf HCatalog checkstyle violation after HIVE-2670 -- Key: HIVE-4657 URL: https://issues.apache.org/jira/browse/HIVE-4657 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Fix For: 0.12.0 Attachments: HIVE-4657.1.patch After HIVE-2670 was committed, I see the following error, {noformat} checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 416 files [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/drivers/TestDriverHiveCmdLine.pm:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_cmdline.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. [checkstyle] /Users/vshree/work/repositories/hive15/hcatalog/src/test/e2e/hcatalog/tests/hive_nightly.conf:1: Line does not match expected header line of '\W*or more contributor license agreements. See the NOTICE file$'. 
[for] hcatalog: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/build.xml:310: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build.xml:109: The following error occurred while executing this line: [for] /Users/vshree/work/repositories/hive15/hcatalog/build-support/ant/checkstyle.xml:32: Got 3 errors and 0 warnings. BUILD FAILED /Users/vshree/work/repositories/hive15/build.xml:308: Keepgoing execution: 2 of 11 iterations failed. {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676637#comment-13676637 ] Hudson commented on HIVE-4554: -- Integrated in Hive-trunk-hadoop2 #227 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/227/]) HIVE-4554 : Failed to create a table from existing file if file path has spaces (Xuefu Zhang via Ashutosh Chauhan) (Revision 1490101) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1490101 Files : * /hive/trunk/build-common.xml * /hive/trunk/data/files/person age.txt * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java * /hive/trunk/ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q * /hive/trunk/ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out * /hive/trunk/ql/src/test/results/clientpositive/load_hdfs_file_with_space_in_the_name.q.out Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3, HIVE-4554.patch.4, HIVE-4554.patch.5 To reproduce the problem, 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the data in the file into the table. The following error can be seen in the console: hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. 
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-hadoop2 - Build # 227 - Failure
Changes for Build #199 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #200 Changes for Build #201 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #202 Changes for Build #203 Changes for Build #204 Changes for Build #205 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #206 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) Changes for Build #207 [gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates) Changes for Build #208 Changes for Build #209 [navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis) Changes for Build #210 Changes for Build #211 Changes for Build #212 Changes for Build #213 Changes for Build #214 [navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis) [navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis) Changes for Build #215 Changes for Build #216 Changes for Build #217 [gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates) Changes for Build #218 Changes for Build #219 [daijy] PIG-3337: Fix remaining Window e2e tests Changes for Build #220 Changes for Build #221 Changes for Build #222 Changes for Build #223 [hashutosh] HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) [hashutosh] HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) 
[hashutosh] HIVE-4626 : join_vc.q is not deterministic (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) [hashutosh] HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang via Ashutosh Chauhan) [hashutosh] HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) [hashutosh] HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan) Changes for Build #224 [hashutosh] HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan) [hashutosh] HIVE-3846 : alter view rename NPEs with authorization on. (Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan) Changes for Build #225 [gates] HIVE-2670 A cluster test utility for Hive (gates and Johnny Zhang via gates) [hashutosh] HIVE-4585 : Remove unused MR Temp file localization from Tasks (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4418 : TestNegativeCliDriver failure message if cmd succeeds is misleading (Thejas Nair via Ashutosh Chauhan) [navis] HIVE-4620 MR temp directory conflicts in case of parallel execution mode (Prasad Mujumdar via Navis) Changes for Build #226 [hashutosh] HIVE-2304 : Support PreparedStatement.setObject (Ido Hadanny via Ashutosh Chauhan) [hashutosh] HIVE-4526 : auto_sortmerge_join_9.q throws NPE but test is succeeded (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4516 : Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java (Jon Hartlaub and Navis via Ashutosh Chauhan) [hashutosh] HIVE-4566 : NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established (Xuefu Zhang via Ashutosh Chauhan) [hashutosh] HIVE-4646 
: skewjoin.q is failing in hadoop2 (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4377 : Add more comment to https://reviews.facebook.net/D1209 (HIVE2340) : (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4546 : Hive CLI leaves behind the per session resource directory on non-interactive invocation (Prasad Mujumdar via Ashutosh Chauhan) Changes for Build #227 [hashutosh] HIVE-4422 : Test output need to be updated for Windows only unit test in TestCliDriver (Shuaishuai Nie via Ashutosh Chauhan) [hashutosh] HIVE-4657 : HCatalog checkstyle violation after HIVE-2670 (Shreepadma Venugopalan via Ashutosh Chauhan) [hashutosh] HIVE-4348 : Unit test compile fail at hbase-handler project on Windows becuase of illegal escape character (Shuaishuai Nie via Ashutosh Chauhan) [hashutosh] HIVE-4554 : Failed to create a table from existing file if file path has spaces (Xuefu Zhang via Ashutosh