[jira] [Commented] (HIVE-7052) Optimize split calculation time
[ https://issues.apache.org/jira/browse/HIVE-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995994#comment-13995994 ] Rajesh Balamohan commented on HIVE-7052: https://reviews.apache.org/r/21357/diff/#index_header Optimize split calculation time --- Key: HIVE-7052 URL: https://issues.apache.org/jira/browse/HIVE-7052 Project: Hive Issue Type: Bug Environment: hive + tez Reporter: Rajesh Balamohan Assignee: Rajesh Balamohan Labels: performance Attachments: HIVE-7052-profiler-1.png, HIVE-7052-profiler-2.png When running a TPC-DS query (query_27), a significant amount of time was spent in split computation on a dataset of size 200 GB (ORC format). Profiling revealed that: 1. A lot of time was spent in Configuration's substituteVars (regex) in the HiveInputFormat.getSplits() method. 2. A FileSystem instance was created repeatedly in OrcInputFormat.generateSplitsInfo(). I will attach the profiler snapshots soon. -- This message was sent by Atlassian JIRA (v6.2#6252)
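A minimal sketch of the second point above, assuming the per-file loop currently calls FileSystem.get(conf) for every file; the class and method names are illustrative, not the actual OrcInputFormat code:
{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SplitListingSketch {
  // Resolve the FileSystem once per input directory and reuse it for every
  // child file, instead of re-creating it inside the per-file loop.
  public static void listChildren(Configuration conf, Path dir, List<FileStatus> out)
      throws IOException {
    FileSystem fs = dir.getFileSystem(conf);   // created once
    for (FileStatus stat : fs.listStatus(dir)) {
      out.add(stat);                           // same fs reused for all children
    }
  }
}
{code}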
[jira] [Updated] (HIVE-6986) MatchPath fails with small resultExprString
[ https://issues.apache.org/jira/browse/HIVE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6986: --- Status: Patch Available (was: Open) OK. +1 MatchPath fails with small resultExprString --- Key: HIVE-6986 URL: https://issues.apache.org/jira/browse/HIVE-6986 Project: Hive Issue Type: Bug Components: UDF Reporter: Furcy Pin Priority: Trivial Attachments: HIVE-6986.1.patch When using MatchPath, a query like this: select year from matchpath(on flights_tiny sort by fl_num, year, month, day_of_month arg1('LATE.LATE+'), arg2('LATE'), arg3(arr_delay > 15), arg4('year') ) ; will fail with the error message FAILED: StringIndexOutOfBoundsException String index out of range: 6 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7060) Column stats give incorrect min and distinct_count
[ https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7060: -- Description: It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite PREHOOK: type: DESCTABLE PREHOOK: Input: default@uservisits_web_text_none POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@uservisits_web_text_none # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null null null {code} was: It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeO from UserVisits_web_text_nonenSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite PREHOOK: type: DESCTABLE PREHOOK: Input: default@uservisits_web_text_none POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@uservisits_web_text_none # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null null null {code} Column stats give incorrect min and distinct_count -- Key: HIVE-7060 URL: https://issues.apache.org/jira/browse/HIVE-7060 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.13.0 Reporter: Xuefu Zhang It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite PREHOOK: type: DESCTABLE PREHOOK: Input: default@uservisits_web_text_none POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@uservisits_web_text_none # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null null null {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
[ https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996997#comment-13996997 ] Xuefu Zhang commented on HIVE-7050: --- Thanks for the patch. Minor comments/questions on RB. One clarification: are column stats shown when either EXTENDED or FORMATTED is specified? And only when a column is specified? I think this is important for documentation purposes. It would be good if the functional details could be put in the description area. Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE - Key: HIVE-7050 URL: https://issues.apache.org/jira/browse/HIVE-7050 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7050.1.patch There is currently no way to display the column level stats from the Hive CLI. It would be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated May 14, 2014, 8:22 p.m.) Review request for hive, Gopal V and Gunther Hagleitner. Repository: hive-git Description --- See JIRA Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 conf/hive-default.xml.template 2552560 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 2dbe334 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory.java accc312 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory2.java 1bd2352 ql/src/java/org/apache/hadoop/hive/ql/Driver.java fce77a8 ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java f5d4670 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 64f0be2 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java d4be78d ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 674ed48 ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java f7b499b ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 093da55 ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 ql/src/test/queries/clientpositive/tez_union.q f80d94c ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out cb11b8b ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 1c16024 ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 614a4a6 serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 9079b9d serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java 1b09d41 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 5870884 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java bab505e 
serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 6f344bb serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java a99c7b4 serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 435d6c6 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 82c1263 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java b188c3f serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java 98a35c7 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 6c14081 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/objectinspector/LazyBinaryStructObjectInspector.java e5ea452 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 06d5c5e serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 868dd4c serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java 1fb49e5 Diff: https://reviews.apache.org/r/18936/diff/ Testing
[jira] [Assigned] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter
[ https://issues.apache.org/jira/browse/HIVE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-7048: - Assignee: Swarnim Kulkarni CompositeKeyHBaseFactory should not use FamilyFilter Key: HIVE-7048 URL: https://issues.apache.org/jira/browse/HIVE-7048 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Priority: Blocker HIVE-6411 introduced a more generic way to provide composite key implementations via custom factory implementations. However it seems like the CompositeHBaseKeyFactory implementation uses a FamilyFilter for row key scans which doesn't seem appropriate. This should be investigated further and if possible replaced with a RowRangeScanFilter. -- This message was sent by Atlassian JIRA (v6.2#6252)
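A minimal sketch of the kind of row-key-range scan the issue is asking for, assuming the composite key prefix has already been serialized to start/stop row bytes; this is a hypothetical helper, not the actual CompositeHBaseKeyFactory code:
{code}
import org.apache.hadoop.hbase.client.Scan;

public class RowRangeScanSketch {
  // Bound the scan by row key instead of filtering on column family,
  // which does not narrow the set of rows that get scanned at all.
  public static Scan forKeyRange(byte[] startRow, byte[] stopRow) {
    Scan scan = new Scan();
    scan.setStartRow(startRow);   // inclusive lower bound on the row key
    scan.setStopRow(stopRow);     // exclusive upper bound on the row key
    return scan;
  }
}
{code}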
[jira] [Updated] (HIVE-7036) get_json_object bug when extract list of list with index
[ https://issues.apache.org/jira/browse/HIVE-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7036: Attachment: HIVE-7036.1.patch.txt get_json_object bug when extract list of list with index Key: HIVE-7036 URL: https://issues.apache.org/jira/browse/HIVE-7036 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.12.0 Environment: all Reporter: Ming Ma Priority: Minor Labels: udf Attachments: HIVE-7036.1.patch.txt https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFJson.java#L250 This line should be moved out of the for-loop. For example, with json = '{h:[1, [2, 3], {i: 0}, [{p: 11}, {p: 12}, {pp: 13}]]}', get_json_object(json, '$.h[*][0]') should return the first node (if it exists) of every child of '$.h', which should be [2,{p:11}], but Hive returns only 2. When Hive picks the node '2' out, tmp_jsonList becomes a list containing only that one node: [2]. It is then assigned to the variable jsonList, so in the next iteration the value of i is 2, which is greater than the size (always 1) of jsonList, and the loop breaks out. -- This message was sent by Atlassian JIRA (v6.2#6252)
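A minimal sketch of the fix the report describes, using simplified types rather than the actual UDFJson internals (the class, method, and variable names are illustrative):
{code}
import java.util.ArrayList;
import java.util.List;

public class JsonPathSketch {
  // Collect the index-th element of every child list into a temporary list
  // and assign it back only after the loop, so later matches are not lost.
  static List<Object> pickIndex(List<Object> jsonList, int index) {
    List<Object> tmp = new ArrayList<Object>();
    for (Object node : jsonList) {
      if (node instanceof List && ((List<?>) node).size() > index) {
        tmp.add(((List<?>) node).get(index));
      }
    }
    return tmp;   // the assignment happens once, outside the loop
  }
}
{code}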
[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6430: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) committed to trunk MapJoin hash table has large memory overhead Key: HIVE-6430 URL: https://issues.apache.org/jira/browse/HIVE-6430 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, HIVE-6430.14.patch, HIVE-6430.patch Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 for row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need to have java hash table there. We can either use primitive-friendly hashtable like the one from HPPC (Apache-licenced), or some variation, to map primitive keys to single row storage structure without an object per row (similar to vectorization). -- This message was sent by Atlassian JIRA (v6.2#6252)
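A minimal sketch of the "no object per row" layout described above, assuming keys can be reduced to a long hash and rows are already serialized to bytes; this is far simpler than the actual BytesBytesMultiHashMap (lookup here is a linear scan rather than real hash probing):
{code}
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class FlatRowStoreSketch {
  // Parallel primitive arrays instead of one Java object per entry: slot i
  // holds the key hash plus the offset/length of the serialized row bytes.
  private long[] keyHashes = new long[1024];
  private int[] offsets = new int[1024];
  private int[] lengths = new int[1024];
  private int size = 0;
  private final ByteArrayOutputStream rowBytes = new ByteArrayOutputStream();

  public void put(long keyHash, byte[] serializedRow) {
    if (size == keyHashes.length) {
      keyHashes = Arrays.copyOf(keyHashes, size * 2);
      offsets = Arrays.copyOf(offsets, size * 2);
      lengths = Arrays.copyOf(lengths, size * 2);
    }
    keyHashes[size] = keyHash;
    offsets[size] = rowBytes.size();
    lengths[size] = serializedRow.length;
    rowBytes.write(serializedRow, 0, serializedRow.length);
    size++;
  }

  public byte[] get(long keyHash) {
    byte[] all = rowBytes.toByteArray();   // copied here for simplicity only
    for (int i = 0; i < size; i++) {       // the real code probes a hash table
      if (keyHashes[i] == keyHash) {
        return Arrays.copyOfRange(all, offsets[i], offsets[i] + lengths[i]);
      }
    }
    return null;
  }
}
{code}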
[jira] [Commented] (HIVE-7041) DoubleWritable/ByteWritable should extend their hadoop counterparts
[ https://issues.apache.org/jira/browse/HIVE-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996583#comment-13996583 ] Hive QA commented on HIVE-7041: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12644205/HIVE-7041.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/190/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/190/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12644205 DoubleWritable/ByteWritable should extend their hadoop counterparts --- Key: HIVE-7041 URL: https://issues.apache.org/jira/browse/HIVE-7041 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-7041.1.patch Hive has its own implementations of ByteWritable/DoubleWritable/ShortWritable. We cannot replace usage of these classes since they will break 3rd party UDFs/SerDes, however we can at least extend from the Hadoop version of these classes when possible to avoid duplicate code. When Hive finally moves to version 1.0 we might want to consider removing use of these Hive-specific writables and switching over to using the Hadoop version of these classes. ShortWritable didn't exist in Hadoop until 2.x so it looks like we can't do it with this class until 0.20/1.x support is dropped from Hive. -- This message was sent by Atlassian JIRA (v6.2#6252)
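A minimal sketch of the direction described above for one of the classes, assuming the Hive class keeps its existing name and package so third-party code still compiles (illustrative only, not the attached patch):
{code}
package org.apache.hadoop.hive.serde2.io;  // assumed placement, for illustration

// Keep the Hive-specific class but inherit the stored value, serialization
// and comparison logic from the Hadoop writable instead of duplicating it.
public class DoubleWritable extends org.apache.hadoop.io.DoubleWritable {
  public DoubleWritable() {
    super();
  }

  public DoubleWritable(double value) {
    super(value);
  }
}
{code}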
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
On May 8, 2014, 10:05 p.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java, line 405 https://reviews.apache.org/r/18936/diff/13/?file=572109#file572109line405 why do you need this? this seems to do the same thing as tag == -1? it's more explicit and stays that way if someone resets tag later On May 8, 2014, 10:05 p.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java, line 470 https://reviews.apache.org/r/18936/diff/13/?file=572109#file572109line470 this should exist on the operator, but on the ReduceSinkDesc when we set it, we are operating on already-created operator - Sergey --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/#review42539 --- On May 9, 2014, 8:16 p.m., Sergey Shelukhin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated May 9, 2014, 8:16 p.m.) Review request for hive, Gopal V and Gunther Hagleitner. Repository: hive-git Description --- See JIRA Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 conf/hive-default.xml.template 2552560 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 142bfd8 ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java f5d4670 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 64f0be2 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java d4be78d ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 674ed48 ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java f7b499b ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 093da55 ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
ql/src/test/queries/clientpositive/tez_union.q f80d94c ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 8350670 ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 9079b9d serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java 1b09d41 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 5870884 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java bab505e serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 6f344bb serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd
[jira] [Commented] (HIVE-4803) LazyTimestamp should accept numeric values
[ https://issues.apache.org/jira/browse/HIVE-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992701#comment-13992701 ] Hive QA commented on HIVE-4803: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643698/HIVE-4803.2.patch.txt {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5496 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_numerics org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/144/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/144/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12643698 LazyTimestamp should accept numeric values -- Key: HIVE-4803 URL: https://issues.apache.org/jira/browse/HIVE-4803 Project: Hive Issue Type: Improvement Components: Types Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-4803.2.patch.txt, HIVE-4803.D11565.1.patch LazyTimestamp accepts a yyyy-mm-dd hh:mm:ss formatted string and 'NULL'. It would be good to also accept a numeric form (milliseconds). -- This message was sent by Atlassian JIRA (v6.2#6252)
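A minimal sketch of the proposed behaviour, assuming the numeric form is interpreted as epoch milliseconds; this is an illustration only, not the attached patch (which works at the LazyTimestamp byte level):
{code}
import java.sql.Timestamp;

public class TimestampParseSketch {
  // Accept either the usual "yyyy-mm-dd hh:mm:ss" form or a plain number,
  // interpreted here as milliseconds since the epoch.
  static Timestamp parse(String text) {
    try {
      return new Timestamp(Long.parseLong(text.trim()));  // numeric input
    } catch (NumberFormatException e) {
      return Timestamp.valueOf(text);                      // formatted input
    }
  }
}
{code}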
[jira] [Commented] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993006#comment-13993006 ] Thejas M Nair commented on HIVE-6768: - [~ekoifman] Can you respond to Ashutosh's comment? Looks like most of the changes in HIVE-5511 are not specific to the issue, but were general cleanup. And the attached patch reverts changes that were specific to the log handling. remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties --- Key: HIVE-6768 URL: https://issues.apache.org/jira/browse/HIVE-6768 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6768.patch Now that MAPREDUCE-5806 is fixed, we can remove override-container-log4j.properties and all the logic around it that was introduced in HIVE-5511 to work around MAPREDUCE-5806. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7045) Wrong results in multi-table insert aggregating without group by clause
[ https://issues.apache.org/jira/browse/HIVE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7045: Priority: Blocker (was: Major) Wrong results in multi-table insert aggregating without group by clause --- Key: HIVE-7045 URL: https://issues.apache.org/jira/browse/HIVE-7045 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.12.0 Reporter: dima machlin Priority: Blocker This happens whenever there is more than one reducer. The scenario: CREATE TABLE t1 (a int, b int); CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string); insert into table t1 select 1,1 from asd limit 1; insert into table t1 select 2,2 from asd limit 1; t1 contains: 1 1 2 2 from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt ; select * from t2; returns: 2 a 2 b as expected. Setting the number of reducers higher than 1: set mapred.reduce.tasks=2; from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt; select * from t2; 1 a 1 a 1 b 1 b Wrong results. This happens whenever t1 is big enough to automatically generate more than one reducer, even without specifying it directly. Adding group by 1 at the end of each insert solves the problem: from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1 insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1; generates: 2 a 2 b This should work without the group by... The number of rows in each partition equals the number of reducers, because each reducer calculated a subtotal of the count. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec
[ https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992839#comment-13992839 ] Hive QA commented on HIVE-6809: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643703/HIVE-6809.5.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5429 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12643703 Support bulk deleting directories for partition drop with partial spec -- Key: HIVE-6809 URL: https://issues.apache.org/jira/browse/HIVE-6809 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt In busy hadoop system, dropping many of partitions takes much more time than expected. In hive-0.11.0, removing 1700 partitions by single partial spec took 90 minutes, which is reduced to 3 minutes when deleteData is set false. I couldn't test this in recent hive, which has HIVE-6256 but if the time-taking part is mostly from removing directories, it seemed not helpful to reduce whole processing time. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4576) templeton.hive.properties does not allow values with commas
[ https://issues.apache.org/jira/browse/HIVE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4576: --- Fix Version/s: 0.13.1 templeton.hive.properties does not allow values with commas --- Key: HIVE-4576 URL: https://issues.apache.org/jira/browse/HIVE-4576 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.5.0 Reporter: Vitaliy Fuks Assignee: Eugene Koifman Fix For: 0.14.0, 0.13.1 Attachments: HIVE-4576.0.13.patch, HIVE-4576.2.patch, HIVE-4576.patch templeton.hive.properties accepts a comma-separated list of key=value property pairs that will be passed to Hive. However, this makes it impossible to use any value that itself has a comma in it. For example: {code:xml}<property> <name>templeton.hive.properties</name> <value>hive.metastore.sasl.enabled=false,hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083</value> </property>{code} {noformat}templeton: starting [/usr/bin/hive, --service, cli, --hiveconf, hive.metastore.sasl.enabled=false, --hiveconf, hive.metastore.uris=thrift://foo1.example.com:9083, --hiveconf, foo2.example.com:9083 etc..{noformat} because the value is parsed using the standard org.apache.hadoop.conf.Configuration.getStrings() call, which simply splits on commas, from here: {code:java}for (String prop : appConf.getStrings(AppConfig.HIVE_PROPS_NAME)){code} This is problematic for any hive property that itself has multiple values, such as hive.metastore.uris above or hive.aux.jars.path. There should be some way to escape commas, or a different delimiter should be used. -- This message was sent by Atlassian JIRA (v6.2#6252)
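A minimal sketch of one possible escaping scheme (a backslash protects a comma inside a value); this illustrates the idea only and is not the fix that was committed:
{code}
import java.util.ArrayList;
import java.util.List;

public class PropsSplitSketch {
  // Split on commas that are not preceded by a backslash, then unescape, so a
  // value like hive.metastore.uris=thrift://host1:9083\,thrift://host2:9083
  // survives as a single property.
  static List<String> splitProps(String raw) {
    List<String> parts = new ArrayList<String>();
    for (String part : raw.split("(?<!\\\\),")) {
      parts.add(part.replace("\\,", ",").trim());
    }
    return parts;
  }
}
{code}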
[jira] [Updated] (HIVE-7042) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7042: - Status: Open (was: Patch Available) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2 -- Key: HIVE-7042 URL: https://issues.apache.org/jira/browse/HIVE-7042 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7042.1.patch stats_partscan_1_23.q and orc_createas1.q should use HiveInputFormat as opposed to CombineHiveInputFormat. RCFile uses DefaultCodec for compression (uses DEFLATE) which is not splittable. Hence using CombineHiveIF will yield different results for these tests. ORC should use HiveIF to generate ORC splits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7062) Support Streaming mode in Windowing
Harish Butani created HIVE-7062: --- Summary: Support Streaming mode in Windowing Key: HIVE-7062 URL: https://issues.apache.org/jira/browse/HIVE-7062 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani 1. Have the Windowing Table Function support streaming mode. 2. Have special handling for Ranking UDAFs. 3. Have special handling for Sum/Avg for fixed-size windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6976) Show query id only when there's jobs on the cluster
[ https://issues.apache.org/jira/browse/HIVE-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6976: - Status: Patch Available (was: Open) Show query id only when there's jobs on the cluster --- Key: HIVE-6976 URL: https://issues.apache.org/jira/browse/HIVE-6976 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-6976.1.patch No need to print the query id for local-only execution. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6945) issues with dropping partitions on Oracle
[ https://issues.apache.org/jira/browse/HIVE-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6945: --- Fix Version/s: 0.13.1 issues with dropping partitions on Oracle - Key: HIVE-6945 URL: https://issues.apache.org/jira/browse/HIVE-6945 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0, 0.13.1 Attachments: HIVE-6945-0.13.1.patch, HIVE-6945.01.patch, HIVE-6945.02.patch, HIVE-6945.patch 1) Direct SQL is broken on Oracle due to the usage of NUMBER type which is translated by DN into decimal rather than long. This appears to be specific to some cases because it seemed to have worked before (different version of Oracle? JDBC? DN? Maybe depends on whether db was auto-created). 2) When partition dropping code falls back to JDO, it creates objects to return, then drops partitions. It appears that dropping makes DN objects invalid. We create metastore partition objects out of DN objects before drop, however the list of partition column values is re-used, rather than copied, into these. DN appears to clear this list during drop, so the returned object becomes invalid and the exception is thrown. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated HIVE-7054: - Attachment: HIVE-7054.2.patch Support ELT UDF in vectorized mode -- Key: HIVE-7054 URL: https://issues.apache.org/jira/browse/HIVE-7054 Project: Hive Issue Type: New Feature Components: Vectorization Affects Versions: 0.14.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7054.2.patch, HIVE-7054.patch Implement support for ELT udf in vectorized execution mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995651#comment-13995651 ] Swarnim Kulkarni commented on HIVE-6411: [~xuefuz] Done. I have also linked to this newly created issue. [1] https://issues.apache.org/jira/browse/HIVE-7048 Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt HIVE-2599 introduced using a custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is in turn an extension of LazyStruct. If the user provides a proper object and OI, we can replace the internal key and keyOI with those. The initial implementation is based on a factory interface:
{code}
public interface HBaseKeyFactory {
  void init(SerDeParameters parameters, Properties properties) throws SerDeException;
  ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
  LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
}
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
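A minimal skeleton of what a user-supplied factory might look like against the interface quoted above; the class name is hypothetical and the method bodies are stubs, since the concrete inspectors depend on the key layout (the interface itself is assumed to be on the classpath):
{code}
import java.util.Properties;

import org.apache.hadoop.hive.serde2.SerDeException;
import org.apache.hadoop.hive.serde2.lazy.LazyObjectBase;
import org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.SerDeParameters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

public class FixedWidthKeyFactory implements HBaseKeyFactory {
  @Override
  public void init(SerDeParameters parameters, Properties properties) throws SerDeException {
    // read the key layout (field widths/types) from the table properties here
  }

  @Override
  public ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException {
    // build a struct inspector matching the key layout
    throw new SerDeException("not implemented in this sketch");
  }

  @Override
  public LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException {
    // return the custom key object backed by the inspector
    throw new SerDeException("not implemented in this sketch");
  }
}
{code}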
[jira] [Updated] (HIVE-7011) HiveInputFormat's split generation isn't thread safe
[ https://issues.apache.org/jira/browse/HIVE-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7011: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the review [~vikram.dixit]! HiveInputFormat's split generation isn't thread safe Key: HIVE-7011 URL: https://issues.apache.org/jira/browse/HIVE-7011 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-7011.1.patch Tez will do split generation in parallel. Need to protect the inputformat cache against concurrent access. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 21289: HIVE-7033 : grant statements should check if the role exists
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21289/#review42625 --- metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java https://reviews.apache.org/r/21289/#comment76447 This should be done within transaction. Else, this may result in TOCTU bug. - Ashutosh Chauhan On May 9, 2014, 11:14 p.m., Thejas Nair wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21289/ --- (Updated May 9, 2014, 11:14 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-7033 https://issues.apache.org/jira/browse/HIVE-7033 Repository: hive-git Description --- The following grant statement that grants to a role that does not exist succeeds, but it should result in an error. grant all on t1 to role nosuchrole; Patch also fixes the handling of role names in some cases to be case insensitive. Diffs - metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 4b4f4f2 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrincipal.java 62b8994 ql/src/test/queries/clientnegative/authorization_role_grant_nosuchrole.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_table_grant_nosuchrole.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_1_sql_std.q 79ae17a ql/src/test/queries/clientpositive/authorization_role_grant1.q f89d0dc ql/src/test/queries/clientpositive/authorization_role_grant2.q 984d7ed ql/src/test/results/clientnegative/authorization_role_grant_nosuchrole.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_table_grant_nosuchrole.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 718ff31 ql/src/test/results/clientpositive/authorization_role_grant1.q.out 3c846eb ql/src/test/results/clientpositive/authorization_role_grant2.q.out 1e8f88a Diff: https://reviews.apache.org/r/21289/diff/ Testing --- New tests included Thanks, Thejas Nair
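A minimal sketch of the pattern the review comment asks for, assuming ObjectStore's usual openTransaction/commitTransaction/rollbackTransaction helpers; the role-lookup call is an assumption (named getMRole here), and this is not the committed patch:
{code}
// Inside an ObjectStore method: do the existence check and the grant in one
// metastore transaction so the role cannot be dropped in between (avoiding
// the time-of-check/time-of-use race the review points out).
boolean committed = false;
try {
  openTransaction();
  MRole role = getMRole(roleName);   // assumed lookup helper
  if (role == null) {
    throw new NoSuchObjectException("Role " + roleName + " does not exist");
  }
  // ... persist the grant for 'role' here ...
  committed = commitTransaction();
} finally {
  if (!committed) {
    rollbackTransaction();
  }
}
{code}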
[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found
[ https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7035: - Attachment: HIVE-7035.patch Templeton returns 500 for user errors - when job cannot be found Key: HIVE-7035 URL: https://issues.apache.org/jira/browse/HIVE-7035 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7035.patch curl -i 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman' should return HTTP Status code 4xx when no such job exists; it currently returns 500. {noformat} {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\r\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat java.security.AccessController.doPrivileged(Native Method)\r\n\tat javax.security.auth.Subject.doAs(Subject.java:415)\r\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n} {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
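A minimal sketch of the kind of mapping implied, assuming the status lookup surfaces ApplicationNotFoundException and the endpoint uses JAX-RS; the helper and interface names are illustrative, not the attached patch:
{code}
import javax.ws.rs.core.Response;

import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

public class JobStatusSketch {
  // Translate "job does not exist" into 404 instead of letting the YARN
  // exception bubble up as an internal server error (500).
  static Response statusFor(String jobId, JobLookup lookup) {
    try {
      return Response.ok(lookup.find(jobId)).build();
    } catch (ApplicationNotFoundException e) {
      return Response.status(Response.Status.NOT_FOUND)
          .entity("No such job: " + jobId).build();
    }
  }

  // Hypothetical lookup abstraction standing in for the real job-tracking code.
  interface JobLookup {
    Object find(String jobId) throws ApplicationNotFoundException;
  }
}
{code}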
[jira] [Updated] (HIVE-6952) Hive 0.13 HiveOutputFormat breaks backwards compatibility
[ https://issues.apache.org/jira/browse/HIVE-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6952: --- Fix Version/s: 0.13.1 Hive 0.13 HiveOutputFormat breaks backwards compatibility - Key: HIVE-6952 URL: https://issues.apache.org/jira/browse/HIVE-6952 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Costin Leau Assignee: Ashutosh Chauhan Priority: Blocker Fix For: 0.14.0, 0.13.1 Attachments: HIVE-6952.patch, HIVE-6952_branch-13.patch Hive 0.13 changed the signature of HiveOutputFormat (through commit r1527149), breaking backwards compatibility with previous releases; the return type of getHiveRecordWriter has been changed from RecordWriter to FSRecordWriter. FSRecordWriter introduces one new method on top of RecordWriter; however, it does not extend the previous interface and it lives in a completely new package. Thus code that runs fine on Hive 0.12 breaks on Hive 0.13, and after the upgrade, code written against Hive 0.13 will break on anything lower than this. This could have easily been avoided by extending the existing interface or introducing a new one that RecordWriter could have extended going forward. By changing the signature, the existing contract (and compatibility) has been voided. -- This message was sent by Atlassian JIRA (v6.2#6252)
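A minimal sketch of the alternative the report suggests (adding the new capability by extension rather than by replacing the returned type), using simplified method signatures rather than the real Hive interfaces:
{code}
// The old contract stays exactly as it was, so 0.12-era code keeps compiling.
interface RecordWriter {
  void write(org.apache.hadoop.io.Writable row) throws java.io.IOException;
  void close(boolean abort) throws java.io.IOException;
}

// The new capability is added by extension instead of replacing the type that
// getHiveRecordWriter returns, so existing callers are unaffected.
interface FSRecordWriter extends RecordWriter {
  // hypothetical extra method standing in for the newly added functionality
  org.apache.hadoop.fs.FileStatus getStatus() throws java.io.IOException;
}
{code}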
[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys
[ https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997550#comment-13997550 ] Xuefu Zhang commented on HIVE-6290: --- User doc should go with HIVE-6411 also. Add support for hbase filters for composite keys Key: HIVE-6290 URL: https://issues.apache.org/jira/browse/HIVE-6290 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Fix For: 0.14.0 Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, HIVE-6290.3.patch.txt Add support for filters to be provided via the composite key class -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7031) Utilities.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS
[ https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993822#comment-13993822 ] Thejas M Nair commented on HIVE-7031: - +1 testCliDriver_schemeAuthority2 is a flaky test; it passed when I ran it locally. The other test failures are unrelated. Utilities.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS --- Key: HIVE-7031 URL: https://issues.apache.org/jira/browse/HIVE-7031 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-7031.1.patch This leads to inconsistent HDFS naming for empty partitions/tables, where a file might be named hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 on the Windows operating system. -- This message was sent by Atlassian JIRA (v6.2#6252)
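A minimal sketch of the distinction at issue: on Windows, File.separator is a backslash, while Hadoop's Path always joins with '/'; the helper below is illustrative, not the actual Utilities code:
{code}
import org.apache.hadoop.fs.Path;

public class ScratchPathSketch {
  // Concatenating with File.separator yields a name ending in "\0" on Windows.
  // Building a Path (or using Path.SEPARATOR) keeps the HDFS path consistent
  // on every platform.
  static Path emptyFilePath(Path scratchDir, String fileName) {
    return new Path(scratchDir, fileName);   // always joined with '/'
  }
}
{code}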
[jira] [Updated] (HIVE-5764) Stopping Metastore and HiveServer2 from command line
[ https://issues.apache.org/jira/browse/HIVE-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5764: Assignee: Xiaobing Zhou Stopping Metastore and HiveServer2 from command line Key: HIVE-5764 URL: https://issues.apache.org/jira/browse/HIVE-5764 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Reporter: Vaibhav Gumashta Assignee: Xiaobing Zhou Fix For: 0.14.0 Currently a user needs to kill the process. Ideally there should be something like: hive --service metastore stop hive --service hiveserver2 stop -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6693) CASE with INT and BIGINT fail
[ https://issues.apache.org/jira/browse/HIVE-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis resolved HIVE-6693. - Resolution: Duplicate CASE with INT and BIGINT fail - Key: HIVE-6693 URL: https://issues.apache.org/jira/browse/HIVE-6693 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.12.0 Reporter: David Gayou CREATE TABLE testCase (n BIGINT); select case when (n > 3) then n else 0 end from testCase; fails with the error: [Error 10016]: Line 1:36 Argument type mismatch '0': The expression after ELSE should have the same type as those after THEN: bigint is expected but int is found'. bigint and int should be more compatible; at least int should implicitly cast to bigint. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
[ https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995846#comment-13995846 ] Sun Rui commented on HIVE-7012: --- For the issue about distinct, I will investigate it later and if I can find a real test case, I will submit a separate jira. Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer Key: HIVE-7012 URL: https://issues.apache.org/jira/browse/HIVE-7012 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Sun Rui Assignee: Navis Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt With HIVE 0.13.0, run the following test case: {code:sql} create table src(key bigint, value string); select count(distinct key) as col0 from src order by col0; {code} The following exception will be thrown: {noformat} java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) ... 9 more Caused by: java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173) ... 14 more Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:reducesinkkey0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79) at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166) ... 14 more {noformat} This issue is related to HIVE-6455. When hive.optimize.reducededuplication is set to false, then this issue will be gone. 
Logical plan when hive.optimize.reducededuplication=false; {noformat} src TableScan (TS_0) alias: src Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator (SEL_1) expressions: key (type: bigint) outputColumnNames: key Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Group By Operator (GBY_2) aggregations: count(DISTINCT key) keys: key (type: bigint) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Reduce Output Operator (RS_3) istinctColumnIndices: key expressions: _col0 (type: bigint) DistributionKeys: 0 sort order: + OutputKeyColumnNames: _col0 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Group By Operator (GBY_4) aggregations: count(DISTINCT KEY._col0:0._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE Select Operator (SEL_5) expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator (RS_6) key expressions: _col0 (type: bigint)
Re: Hive and MR2
Any inputs? Sent from my iPhone5s On May 7, 2014, at 18:11, Azuryy Yu azury...@gmail.com wrote: Hi, I am using hive-0.13.0 and hadoop-2.4.0. Why must I set 'mapreduce.jobtracker.address' in yarn-site.xml? Otherwise, there are exceptions and the job fails. And 'mapreduce.jobtracker.address' can be set to any value. The following messages are generated without setting 'mapreduce.jobtracker.address'. Job output on the console: Execution log at: /tmp/test/test_20140507180505_bcd4d89f-017c-4cf4-81a3-5fa619de0ad0.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 1; number of reducers: 1 2014-05-07 18:06:25,782 null map = 0%, reduce = 0% 2014-05-07 18:06:33,699 null map = 100%, reduce = 0% 2014-05-07 18:06:34,774 null map = 0%, reduce = 0% 2014-05-07 18:06:49,222 null map = 100%, reduce = 100% Ended Job = job_1399453944131_0006 with errors Error during job, obtaining debugging information... Container error: 2014-05-07 18:06:33,634 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: file:/tmp/test/hive_2014-05-07_18-06-08_349_1526907284076641211-1/-mr-10001/0a1c9ebe-cdb0-4adc-9e93-8f176019f19a/map.xml 2014-05-07 18:06:33,635 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
[jira] [Created] (HIVE-7064) Remove Noop PTFs from FunctionRegistry and special handling within PTF translation
Harish Butani created HIVE-7064: --- Summary: Remove Noop PTFs from FunctionRegistry and special handling within PTF translation Key: HIVE-7064 URL: https://issues.apache.org/jira/browse/HIVE-7064 Project: Hive Issue Type: Bug Reporter: Harish Butani It is time to remove special handling of Noop PTFs from translation code. These should not be exposed as OOB functions. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Apache Hive 0.13.1
Hi folks, As an update, HIVE-6945 has some 0.13.1 specific test fixes appended which make its tests pass, the test that was failing with HIVE-6826 is now succeeding(flaky test), and Thejas has confirmed with me that the issue with HIVE-6846 is a test problem, not a product problem, relating to an incorrect expectation in the test. With those resolved, there are no more blockers, and no additional jiras that have been requested to be part of this release, so I'll go ahead and spin out RC0 now, and will also commit all those patches to the 0.13 branch. :) On Wed, May 7, 2014 at 7:22 PM, Sushanth Sowmyan khorg...@gmail.com wrote: After much experimentation with git bisect (which is very powerful), I've narrowed down the test failures reported yesterday. The failures are appearing from the following: HIVE-6945: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property HIVE-6846: org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs HIVE-6826: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6 Of the above, the second jira was already in 0.13.0. I'll comment up on those jiras asking the committers involved in those bugs and to help debug the issue. If anyone is interested in the git bisect logs for these, they're available on http://people.apache.org/~khorgath/releases/0.13.1_RC0/test_failures/ On Tue, May 6, 2014 at 6:41 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Also, I wanted to throw in one more bit for those of you that are interested in tinkering along : http://people.apache.org/~khorgath/releases/0.13.1_RC0/relprep.pl http://people.apache.org/~khorgath/releases/0.13.1_RC0/requested_jiras This is the script and config file I'm using to generate this release. It's very much a hack right now, and I hope to improve it to streamline releases in the future, but how it can be used right now is this way: a) Put it in a hive git repo (and not have any changes that have not been committed - this script will checkout a new branch and commit things to that branch, so you want to make sure to have a clean repo) b) Put the file requested_jiras in that dir as well. c) Run the script from there. It will check the differences between the branch being released (branch-0.13 is hardcoded currently as a global), and looks at all the commit logs in trunk that correspond to the jiras requested in the requested_jiras file, sorts them in the order they were committed, and then checks out a new branch called relprep-branch-0.13-timestamp, and attempts to cherry-pick those commits in. For some patches, this will not work, so there is an override mechanism provided by entries in the requested_jiras file, as can be observed in the file I mention above. At the end of it, you'll have your 0.13.1 repo reproduction to test against if you so desire. Known Bugs : a) I use system() or die ...;, which is faulty in that the die code will never be reached. I need to fix this, but all the system calls were working for me, and I'd much rather focus on the release now, and improve this script later. This is a TODO b) Some patches (those generated with --no-prefix) don't work with older versions of git. 
You'll need a 1.8.x git for them, or you have to generate git patches without --no-prefix. On Tue, May 6, 2014 at 6:21 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi Folks, After a run of the ptest framework across the 0.13.1 codebase, we have a couple of test failures that I'm trying to track down and resolve. If any of you are interested in looking at it on your own in the meanwhile, the conglomerate patch of all the patches I'm forward porting into 0.13.1 is over at http://people.apache.org/~khorgath/releases/0.13.1_RC0/0.13.1.gdiff.patch The current tests that are failing are as follows: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs I'll update and follow up with patch devs as and when I find out the source for these errors. Thanks, -Sushanth On Mon, May 5, 2014 at 6:26 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi Folks, It's past 6pm PDT on May 5th 2014, so I'm beginning the process to generate the 0.13.1 RC0. I've received backport patches for
[jira] [Commented] (HIVE-7026) Support newly added role related APIs for v1 authorizer
[ https://issues.apache.org/jira/browse/HIVE-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993021#comment-13993021 ] Hive QA commented on HIVE-7026: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643702/HIVE-7026.1.patch.txt {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 5428 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_complex_types org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_keyword_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_roles org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_7 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_part org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_create org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_drop org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_show_role_principals_v1 org.apache.hadoop.hive.jdbc.TestJdbcDriver.testShowGrant org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/146/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/146/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12643702 Support newly added role related APIs for v1 authorizer --- Key: HIVE-7026 URL: https://issues.apache.org/jira/browse/HIVE-7026 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7026.1.patch.txt, HIVE-7026.2.patch.txt Support SHOW_CURRENT_ROLE and SHOW_ROLE_PRINCIPALS for v1 authorizer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7058) Cleanup HiveHBase*InputFormat
Nick Dimiduk created HIVE-7058: -- Summary: Cleanup HiveHBase*InputFormat Key: HIVE-7058 URL: https://issues.apache.org/jira/browse/HIVE-7058 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Once the HBase mapred API has support for providing a Scan instance, we should clean up the code around HBase InputFormats to make use of it and share common predicate pushdown logic. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7031) Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS
[ https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7031: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Hari! Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS --- Key: HIVE-7031 URL: https://issues.apache.org/jira/browse/HIVE-7031 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-7031.1.patch This leads to inconsistent HDFS naming for empty partitions/tables, where a file might be named as hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 on the Windows operating system -- This message was sent by Atlassian JIRA (v6.2#6252)
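For context on the bug above: java.io.File.separator is platform-dependent (a backslash on Windows), while HDFS paths always use forward slashes, which is how the mangled name in the description is produced. A minimal, self-contained sketch of the difference, assuming hadoop-common on the classpath; the directory name is illustrative and this is not Hive's actual scratch-dir code:
{code}
import org.apache.hadoop.fs.Path;

public class SeparatorDemo {
  public static void main(String[] args) {
    String scratchDir = "hdfs://headnode0:9000/hive/scratch/hive_query-1";

    // Wrong: File.separator is "\" on Windows, so the name ends in "...\-mr-10010\0"
    String joinedWithFileSeparator =
        scratchDir + java.io.File.separator + "-mr-10010" + java.io.File.separator + "0";

    // Right: Hadoop's Path joins components with "/" regardless of the operating system
    Path joinedWithPath = new Path(new Path(scratchDir, "-mr-10010"), "0");

    System.out.println(joinedWithFileSeparator);
    System.out.println(joinedWithPath);
  }
}
{code}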
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Patch Available (was: Open) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to the Hive client on the node executing a job submitted through WebHCat (a hive query, for example). This should include hive.execution.engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5631) Index creation on a skew table fails
[ https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5631: Fix Version/s: (was: 0.13.0) 0.14.0 Index creation on a skew table fails Key: HIVE-5631 URL: https://issues.apache.org/jira/browse/HIVE-5631 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.14.0 Attachments: HIVE-5631.1.patch.txt, HIVE-5631.2.patch.txt, HIVE-5631.3.patch.txt REPRO STEPS: create database skewtest; use skewtest; create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH'); create index skew_indx on table skew (id) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; The last DDL fails with the following error. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Invalid skew column [acct]) When creating a table, Hive has sanity tests to make sure the columns have proper names and the skewed columns are a subset of the table columns. Here we fail because the index table has skewed column info. The index table's skewed columns include {acct}, while its columns are {id, _bucketname, _offsets}. As the skewed column {acct} is not part of the table columns, Hive throws the exception. The reason the index table got skewed column info even though its definition has no such info is: when creating the index table, a deep copy of the base table's StorageDescriptor (SD) (in this case 'skew') is made. In that copied SD, index-specific parameters are set and unrelated parameters are reset. Here the skewed column info is not reset (there are a few other params that are not reset). That's why the index table contains the skewed column info. Fix: Instead of deep copying the base table StorageDescriptor, create a new one from gathered info. This way the index table avoids inheriting unnecessary SD properties from the base table. -- This message was sent by Atlassian JIRA (v6.2#6252)
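To make the fix described above concrete, here is a rough sketch contrasting the two approaches. The helper method names are invented for illustration and this is not the actual patch; only the metastore StorageDescriptor API (assumed on the classpath) is real:
{code}
import java.util.List;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.metastore.api.SerDeInfo;
import org.apache.hadoop.hive.metastore.api.StorageDescriptor;

public class IndexSdSketch {
  // Problematic pattern: deep-copying the base table's SD drags along skewed-column
  // info (and other properties) that make no sense for the index table.
  static StorageDescriptor byDeepCopy(StorageDescriptor baseSd) {
    StorageDescriptor indexSd = new StorageDescriptor(baseSd); // Thrift copy constructor
    // index-specific fields get overwritten later, but skewedInfo etc. survive
    return indexSd;
  }

  // Sketch of the fix: build a fresh SD from only the information the index table needs.
  static StorageDescriptor fromGatheredInfo(List<FieldSchema> indexCols, StorageDescriptor baseSd) {
    StorageDescriptor indexSd = new StorageDescriptor();
    indexSd.setCols(indexCols);                          // e.g. {id, _bucketname, _offsets}
    indexSd.setInputFormat(baseSd.getInputFormat());
    indexSd.setOutputFormat(baseSd.getOutputFormat());
    indexSd.setSerdeInfo(new SerDeInfo(baseSd.getSerdeInfo()));
    // skewed-column info is deliberately not carried over
    return indexSd;
  }
}
{code}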
[jira] [Commented] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found
[ https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993011#comment-13993011 ] Thejas M Nair commented on HIVE-7035: - +1 Templeton returns 500 for user errors - when job cannot be found Key: HIVE-7035 URL: https://issues.apache.org/jira/browse/HIVE-7035 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7035.patch curl -i 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman' should return HTTP Status code 4xx when no such job exists; it currently returns 500. {noformat} {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\r\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat java.security.AccessController.doPrivileged(Native Method)\r\n\tat javax.security.auth.Subject.doAs(Subject.java:415)\r\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n} {noformat} NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7030) Remove hive.hadoop.classpath from hiveserver2.cmd
[ https://issues.apache.org/jira/browse/HIVE-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-7030: --- Resolution: Fixed Status: Resolved (was: Patch Available) Remove hive.hadoop.classpath from hiveserver2.cmd - Key: HIVE-7030 URL: https://issues.apache.org/jira/browse/HIVE-7030 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-7030.1.patch This parameter is not used anywhere and should be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/#review42539 --- ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java https://reviews.apache.org/r/18936/#comment76332 This is nice. But it should have documentation for the class and public methods ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java https://reviews.apache.org/r/18936/#comment76333 There has to be a more portable way to create a temp file. ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java https://reviews.apache.org/r/18936/#comment76334 Can you make this a jira or drop it if it's not important enough? ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java https://reviews.apache.org/r/18936/#comment76335 coding standards ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java https://reviews.apache.org/r/18936/#comment76337 why do you need this? this seems to do the same thing as tag == -1? ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java https://reviews.apache.org/r/18936/#comment76338 this shouldn't exist on the operator, but on the ReduceSinkDesc ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java https://reviews.apache.org/r/18936/#comment76340 needs asf header ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java https://reviews.apache.org/r/18936/#comment76341 can you please use curlies in this file. coding standard again. ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java https://reviews.apache.org/r/18936/#comment76342 same as before. todos should be jiras or removed if not important ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java https://reviews.apache.org/r/18936/#comment76336 if debug enabled? - Gunther Hagleitner On May 1, 2014, 2:29 a.m., Sergey Shelukhin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated May 1, 2014, 2:29 a.m.) Review request for hive, Gopal V and Gunther Hagleitner. 
Repository: hive-git Description --- See JIRA Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 conf/hive-default.xml.template 2552560 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 142bfd8 ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java f5d4670 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 64f0be2 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java d4be78d ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 3077d75 ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java f7b499b ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 093da55
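On the review comment above about a more portable way to create a temp file: a small sketch, not the patch itself, of letting the JDK pick the location via java.io.tmpdir instead of hand-building a path (the file name prefix is made up):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempFileDemo {
  public static void main(String[] args) throws IOException {
    // Creates the file under the platform's temp directory (java.io.tmpdir),
    // so the same code works on Linux and Windows.
    Path dump = Files.createTempFile("hive-debug-", ".dump");
    Files.write(dump, "example contents".getBytes(StandardCharsets.UTF_8));
    System.out.println("wrote debug output to " + dump);
  }
}
{code}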
[jira] [Updated] (HIVE-6999) Add streaming mode to PTFs
[ https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6999: Attachment: HIVE-6999.3.patch fix show_functions.q.out diff Add streaming mode to PTFs -- Key: HIVE-6999 URL: https://issues.apache.org/jira/browse/HIVE-6999 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch, HIVE-6999.3.patch There are a set of use cases where the Table Function can operate on a Partition row by row or on a subset(window) of rows as it is being streamed to it. - Windowing has couple of use cases of this:processing of Rank functions, processing of Window Aggregations. - But this is a generic concept: any analysis that operates on an Ordered partition maybe able to operate in Streaming mode. This patch introduces streaming mode in PTFs and provides the mechanics to handle PTF chains that contain both modes of PTFs. Subsequent patches will introduce Streaming mode for Windowing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-7066 started by David Chen. hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 
13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148) at
[jira] [Updated] (HIVE-5908) Use map-join hint to cache intermediate result
[ https://issues.apache.org/jira/browse/HIVE-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5908: Fix Version/s: (was: 0.13.0) 0.14.0 Use map-join hint to cache intermediate result -- Key: HIVE-5908 URL: https://issues.apache.org/jira/browse/HIVE-5908 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: daxingyu Priority: Minor Labels: features Fix For: 0.14.0 Original Estimate: 72h Remaining Estimate: 72h There are some very complicated queries in our project, and some intermediate results can be very small. But hive will treat these results as part of a mapreduce job, which is very costly. So I propose to use a map-join hint to cache these small results and speed up hive job execution. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 21471: HIVE-7066: hive-exec jar is missing avro-mapred
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21471/ --- Review request for hive. Bugs: HIVE-7066 https://issues.apache.org/jira/browse/HIVE-7066 Repository: hive-git Description --- Restores the Avro core jar in the hive-exec jar. The hive-exec jar only contained avro-mapred but not core Avro, which caused the AvroSerDe to break. Diffs - ql/pom.xml 71daa26 Diff: https://reviews.apache.org/r/21471/diff/ Testing --- Confirmed that core Avro is now included in the hive-exec jar. Successfully ran sample query against table registered with the AvroSerDe. Thanks, David Chen
[jira] [Updated] (HIVE-7036) get_json_object bug when extract list of list with index
[ https://issues.apache.org/jira/browse/HIVE-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7036: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! get_json_object bug when extract list of list with index Key: HIVE-7036 URL: https://issues.apache.org/jira/browse/HIVE-7036 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.12.0, 0.13.0 Environment: all Reporter: Ming Ma Assignee: Navis Priority: Minor Labels: udf Fix For: 0.14.0 Attachments: HIVE-7036.1.patch.txt https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFJson.java#L250 this line should be outside the for-loop. For example, json = '{h:[1, [2, 3], {i: 0}, [{p: 11}, {p: 12}, {pp: 13}]]}' get_json_object(json, '$.h[*][0]') should return the first node (if it exists) of every child of '$.h', which specifically should be [2,{p:11}], but hive returns only 2, because when hive picks the node '2' out, tmp_jsonList changes to a list that contains only the one node '2': [2]. It is then assigned to the variable jsonList, so in the next loop iteration the value of i is 2, which is greater than the size (always 1) of jsonList, and the loop breaks out. -- This message was sent by Atlassian JIRA (v6.2#6252)
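To illustrate the loop bug described above without reproducing UDFJson itself, here is a simplified, self-contained sketch of the pattern, with made-up method and variable names: the working list gets replaced inside the loop, so iteration stops after the first match.
{code}
import java.util.ArrayList;
import java.util.List;

public class IndexExtractSketch {

  // Buggy pattern: the list being iterated is reassigned inside the loop, so after the
  // first match i is compared against the size of the result list (always 1) and the
  // loop exits early -- only [2] comes back instead of [2, {"p":11}].
  static List<Object> buggy(List<List<Object>> children, int index) {
    List<Object> result = new ArrayList<>();
    List<?> jsonList = children;
    for (int i = 0; i < jsonList.size(); i++) {
      Object child = jsonList.get(i);
      if (child instanceof List && ((List<?>) child).size() > index) {
        result.add(((List<?>) child).get(index));
        jsonList = result; // BUG: this assignment belongs after the loop
      }
    }
    return result;
  }

  // Fixed pattern: keep iterating over the original children and collect as you go.
  static List<Object> fixed(List<List<Object>> children, int index) {
    List<Object> result = new ArrayList<>();
    for (List<Object> child : children) {
      if (child.size() > index) {
        result.add(child.get(index));
      }
    }
    return result;
  }
}
{code}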
[jira] [Commented] (HIVE-6908) TestThriftBinaryCLIService.testExecuteStatementAsync has intermittent failures
[ https://issues.apache.org/jira/browse/HIVE-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995634#comment-13995634 ] Ashutosh Chauhan commented on HIVE-6908: +1 TestThriftBinaryCLIService.testExecuteStatementAsync has intermittent failures -- Key: HIVE-6908 URL: https://issues.apache.org/jira/browse/HIVE-6908 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6908.patch This has failed sometimes in the pre-commit tests. ThriftCLIServiceTest.testExecuteStatementAsync runs two statements. They are given a 100-second timeout in total, not sure if that's intentional. As the first is a select query, it will take a majority of the time. The second statement (create table) should be quicker, but it fails sometimes because the timeout is already mostly used up. The timeout should probably be reset after the first statement. If the operation finishes before the timeout, it won't have any effect as it'll break out. -- This message was sent by Atlassian JIRA (v6.2#6252)
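A minimal sketch of the suggested change, resetting the deadline for each statement instead of sharing one budget; this is not the actual test code, and the small interface below is invented for illustration:
{code}
public class PerStatementTimeoutSketch {

  interface AsyncOp {
    boolean isFinished();
  }

  // Shared budget (roughly what the test does today): a slow first operation
  // consumes most of the 100 seconds, so the second one times out spuriously.
  static void pollWithSharedBudget(long budgetMs, AsyncOp... ops) throws InterruptedException {
    long deadline = System.currentTimeMillis() + budgetMs;
    for (AsyncOp op : ops) {
      waitFor(op, deadline);
    }
  }

  // Per-operation budget (the suggested fix): the clock restarts for each operation.
  static void pollWithPerOpBudget(long budgetMs, AsyncOp... ops) throws InterruptedException {
    for (AsyncOp op : ops) {
      waitFor(op, System.currentTimeMillis() + budgetMs);
    }
  }

  private static void waitFor(AsyncOp op, long deadlineMs) throws InterruptedException {
    while (!op.isFinished()) {
      if (System.currentTimeMillis() > deadlineMs) {
        throw new IllegalStateException("operation did not finish before the deadline");
      }
      Thread.sleep(100);
    }
  }
}
{code}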
[jira] [Commented] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998274#comment-13998274 ] David Chen commented on HIVE-7066: -- I have posted a patch for a fix. I have tested this on trunk by confirming that Avro core is in the hive-exec jar and successfully running a simple Hive query against a table registered with the AvroSerDe. The fix was a simple 1 line change. It looks like this issue was caused by the Ant - Maven switch and the avro core jar was inadvertently left out when creating the hive-exec jar. I am not able to create an RB right now because RB is giving me a 502 error when I try to create a new review request, both using {{rbt post}} and manually via the RB web UI. I will try to create a RB later. hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Attachments: HIVE-7066.1.patch Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
After upgrading Hadoop version, TestHive_7 is the only issue as explained below. RC looks good. On Tue, May 13, 2014 at 8:14 PM, Eugene Koifman ekoif...@hortonworks.comwrote: TestHive_7 is explained by https://issues.apache.org/jira/browse/HIVE-6521, which is in trunk but not 13.1 On Tue, May 13, 2014 at 6:50 PM, Eugene Koifman ekoif...@hortonworks.comwrote: I downloaded src tar, built it and ran webhcat e2e tests. I see 2 failures (which I don't see on trunk) TestHive_7 fails with got percentComplete map 100% reduce 0%, expected map 100% reduce 100% TestHeartbeat_1 fails to even launch the job. This looks like the root cause ERROR | 13 May 2014 18:24:00,394 | org.apache.hive.hcatalog.templeton.CatchallExceptionMapper | java.lang.NullPointerException at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:312) at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:479) at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:170) at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:107) at org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:103) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hive.hcatalog.templeton.LauncherDelegator.queueAsUser(LauncherDelegator.java:103) at org.apache.hive.hcatalog.templeton.LauncherDelegator.enqueueController(LauncherDelegator.java:81) at org.apache.hive.hcatalog.templeton.JarDelegator.run(JarDelegator.java:55) at org.apache.hive.hcatalog.templeton.Server.mapReduceJar(Server.java:711) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:392) at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:87) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031) at
[jira] [Assigned] (HIVE-5733) Publish hive-exec artifact without all the dependencies
[ https://issues.apache.org/jira/browse/HIVE-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu reassigned HIVE-5733: - Assignee: Amareshwari Sriramadasu Publish hive-exec artifact without all the dependencies --- Key: HIVE-5733 URL: https://issues.apache.org/jira/browse/HIVE-5733 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Jarek Jarcec Cecho Assignee: Amareshwari Sriramadasu Currently the artifact {{hive-exec}} that is available in [maven|http://search.maven.org/remotecontent?filepath=org/apache/hive/hive-exec/0.12.0/hive-exec-0.12.0.jar] is shading all the dependencies (= the jar contains all Hive's dependencies). As other projects that depend on Hive might use slightly different versions of the dependencies, it can easily happen that Hive's shaded version will be used instead, which leads to very time-consuming debugging of what is happening (for example SQOOP-1198). Would it be feasible to publish a {{hive-exec}} jar that is built without shading any dependency? For example [avro-tools|http://search.maven.org/#artifactdetails%7Corg.apache.avro%7Cavro-tools%7C1.7.5%7Cjar] has a nodeps classifier that represents the artifact without any dependencies. -- This message was sent by Atlassian JIRA (v6.2#6252)
Hive Error Log -Thanks for your help!
Hi~ When I run a hive statement(select * from lab.ec_web_log limit 100), I got an error. Should I do anything for fixing it? Thanks for your help! Lab.ec_web_log create statement: CREATE external TABLE lab.ec_web_log ( host STRING, ipaddress STRING, identd STRING, user STRING,finishtime STRING, requestline STRING, returncode INT, size INT, getstr STRING, retstatus INT, v_P03_1 STRING, v_P04 STRING, v_P06 STRING, v_P08 STRING, v_P09 STRING, v_P10 STRING, v_P11 STRING, v_P12 STRING, v_P13 STRING, v_P14 STRING, v_P15 STRING, v_P16 STRING, v_P17 STRING, v_P18 STRING, v_P19 STRING, v_P20 STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe' WITH SERDEPROPERTIES ( 'serialization.format'='org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol', 'quote.delim'='(|\\[|\\])', 'field.delim'=' ', 'serialization.null.format'='-') STORED AS TEXTFILE LOCATION '/user/audil/weblog/'; Web log format: xxx..com xxx.xxx.xxx.xxx - - [04/May/2014:23:59:59 +0800] 1 1248214 GET /buy/index.php?action=product_detailprod_no=P200382387prod_sort_uid=3304 HTTP/1.1 200 30975 202.39.48.37 - Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36 - Error List: 2014-05-14 13:55:07,751 WARN snappy.LoadSnappy (LoadSnappy.java:clinit(36)) - Snappy native library is available 2014-05-14 15:01:24,303 WARN mapred.JobClient (JobClient.java:copyAndConfigureFiles(746)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. 2014-05-14 15:42:09,652 ERROR exec.Task (SessionState.java:printError(410)) - Ended Job = job_201404092012_0138 with errors 2014-05-14 15:42:09,655 ERROR exec.Task (SessionState.java:printError(410)) - Error during job, obtaining debugging information... 2014-05-14 15:42:09,656 ERROR exec.Task (SessionState.java:printError(410)) - Job Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201404092012_0138 2014-05-14 15:42:09,659 ERROR exec.Task (SessionState.java:printError(410)) - Examining task ID: task_201404092012_0138_m_02 (and more) from job job_201404092012_0138 2014-05-14 15:42:09,878 ERROR exec.Task (SessionState.java:printError(410)) - Task with the most failures(4): - Task ID: task_201404092012_0138_m_00 URL: http://hdp001-jt:50030/taskdetails.jsp?jobid=job_201404092012_0138tipid=task_201404092012_0138_m_00 - Diagnostic Messages for this Task: Task attempt_201404092012_0138_m_00_3 failed to report status for 600 seconds. Killing! 2014-05-14 15:42:09,900 ERROR ql.Driver (SessionState.java:printError(410)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask 2014-05-14 15:56:30,759 ERROR ql.Driver (SessionState.java:printError(410)) - FAILED: ParseException line 1:0 cannot recognize input near 'conf' '.' 'set' org.apache.hadoop.hive.ql.parse.ParseException: line 1:0 cannot recognize input near 'conf' '.' 
'set' at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:193) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Description: on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for \*hcatalog-core-\*.jar etc. In Pig 12.1 it's looking for hcatalog-core-\*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with {noformat} 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] Failed to parse: Pig script failed to parse: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) at org.apache.pig.PigServer.executeBatch(PigServer.java:369) at org.apache.pig.PigServer.executeBatch(PigServer.java:355) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:478) at org.apache.pig.Main.main(Main.java:156) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299) at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284) at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158) at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756) at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669) at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102) at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560) at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188) ... 16 more Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653) at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296) ... 24 more {noformat} the key to this is {noformat} ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar: No such file or directory ls:
[jira] [Created] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results
Prasanth J created HIVE-7067: Summary: Min() and Max() on Timestamp and Date columns for ORC returns wrong results Key: HIVE-7067 URL: https://issues.apache.org/jira/browse/HIVE-7067 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J min() and max() of timestamp and date columns of ORC table returns wrong results. The reason for that is when ORC creates object inspectors for date and timestamp it uses JAVA primitive objects as opposed to WRITABLE objects. When get() is performed on java primitive objects, a reference to the underlying object is returned whereas when get() is performed on writable objects, a copy of the underlying object is returned. Fix is to change the object inspector creation to return writable objects for timestamp and date. -- This message was sent by Atlassian JIRA (v6.2#6252)
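A self-contained illustration of the reference-versus-copy problem described above, using a reused java.sql.Timestamp to stand in for ORC's row reader and object inspectors (which this sketch does not reproduce):
{code}
import java.sql.Timestamp;

public class ReusedObjectSketch {
  public static void main(String[] args) {
    // Pretend the reader reuses one Timestamp instance for every row it returns.
    Timestamp reused = new Timestamp(0);

    Timestamp minByReference = null; // buggy: keeps a reference to the reused object
    Timestamp minByCopy = null;      // correct: keeps its own copy of the candidate

    long[] rowsMillis = {5000L, 1000L, 9000L};
    for (long millis : rowsMillis) {
      reused.setTime(millis); // the reader advances to the next row, mutating the instance
      if (minByReference == null || reused.before(minByReference)) {
        minByReference = reused; // aliases the reused instance
      }
      if (minByCopy == null || reused.before(minByCopy)) {
        minByCopy = new Timestamp(reused.getTime()); // defensive copy
      }
    }

    System.out.println("min by reference: " + minByReference); // last row (9000 ms) -- wrong
    System.out.println("min by copy:      " + minByCopy);      // 1000 ms -- correct
  }
}
{code}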
[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table
[ https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997920#comment-13997920 ] Sushanth Sowmyan commented on HIVE-6473: Patch looks good to me. I'll try to kick off some tests on this myself. One more thing though - you remove hbase-handler/src/test/queries/positive/hbase_bulk.m in this patch, but you do not remove the corresponding hbase-handler/src/test/results/positive/hbase_bulk.m.out file. Could you add that removal as well? I'm +1 on it otherwise though, and will commit once we have a test run. Allow writing HFiles via HBaseStorageHandler table -- Key: HIVE-6473 URL: https://issues.apache.org/jira/browse/HIVE-6473 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, HIVE-6473.1.patch.txt Generating HFiles for bulkload into HBase could be more convenient. Right now we require the user to register a new table with the appropriate output format. This patch allows the exact same functionality, but through an existing table managed by the HBaseStorageHandler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results
[ https://issues.apache.org/jira/browse/HIVE-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7067: - Priority: Critical (was: Major) Min() and Max() on Timestamp and Date columns for ORC returns wrong results --- Key: HIVE-7067 URL: https://issues.apache.org/jira/browse/HIVE-7067 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Attachments: HIVE-7067.1.patch min() and max() of timestamp and date columns of ORC table returns wrong results. The reason for that is when ORC creates object inspectors for date and timestamp it uses JAVA primitive objects as opposed to WRITABLE objects. When get() is performed on java primitive objects, a reference to the underlying object is returned whereas when get() is performed on writable objects, a copy of the underlying object is returned. Fix is to change the object inspector creation to return writable objects for timestamp and date. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 21289: HIVE-7033 : grant statements should check if the role exists
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21289/ --- (Updated May 9, 2014, 11:14 p.m.) Review request for hive and Ashutosh Chauhan. Changes --- HIVE-7033.2.patch - updating comment in .q file Bugs: HIVE-7033 https://issues.apache.org/jira/browse/HIVE-7033 Repository: hive-git Description --- The following grant statement that grants to a role that does not exist succeeds, but it should result in an error. grant all on t1 to role nosuchrole; Patch also fixes the handling of role names in some cases to be case insensitive. Diffs (updated) - metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 4b4f4f2 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrincipal.java 62b8994 ql/src/test/queries/clientnegative/authorization_role_grant_nosuchrole.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_table_grant_nosuchrole.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_1_sql_std.q 79ae17a ql/src/test/queries/clientpositive/authorization_role_grant1.q f89d0dc ql/src/test/queries/clientpositive/authorization_role_grant2.q 984d7ed ql/src/test/results/clientnegative/authorization_role_grant_nosuchrole.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_table_grant_nosuchrole.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 718ff31 ql/src/test/results/clientpositive/authorization_role_grant1.q.out 3c846eb ql/src/test/results/clientpositive/authorization_role_grant2.q.out 1e8f88a Diff: https://reviews.apache.org/r/21289/diff/ Testing --- New tests included Thanks, Thejas Nair
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6901: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xuefu! Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6109.10.patch, HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.3.patch, HIVE-6901.4.patch, HIVE-6901.5.patch, HIVE-6901.6.patch, HIVE-6901.7.patch, HIVE-6901.8.patch, HIVE-6901.9.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7031) Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS
[ https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993787#comment-13993787 ] Ashutosh Chauhan commented on HIVE-7031: +1 Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS --- Key: HIVE-7031 URL: https://issues.apache.org/jira/browse/HIVE-7031 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-7031.1.patch This leads to inconsistent HDFS naming for empty partitions/tables, where a file might be named as hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 on the Windows operating system -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6438) Sort query result for test, removing order by clause
[ https://issues.apache.org/jira/browse/HIVE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992600#comment-13992600 ] Hive QA commented on HIVE-6438: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643697/HIVE-6438.4.patch.txt {color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 5428 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join34 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_map_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union24 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/143/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/143/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 33 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12643697 Sort query result for test, removing order by clause - Key: HIVE-6438 URL: https://issues.apache.org/jira/browse/HIVE-6438 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-6438.1.patch.txt, HIVE-6438.2.patch.txt, HIVE-6438.3.patch.txt, HIVE-6438.4.patch.txt To get conforming output across various hadoop versions, most queries have an order-by clause. If we support a test declaration similar to SORT_BEFORE_DIFF, which sorts the output per query, we can remove the order-by clauses and reduce the test time. -- This message was sent by Atlassian JIRA (v6.2#6252)
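A rough sketch of the sort-before-diff idea: rather than forcing an ORDER BY into every test query, sort the textual query output on both sides before comparing. The file handling below is illustrative, not the ptest framework's actual code:
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

public class SortBeforeDiffSketch {
  public static void main(String[] args) throws IOException {
    // Query results are line-oriented; sorting both sides makes the comparison
    // insensitive to row order, so the query itself no longer needs ORDER BY.
    List<String> actual = Files.readAllLines(Paths.get(args[0]));
    List<String> expected = Files.readAllLines(Paths.get(args[1]));
    Collections.sort(actual);
    Collections.sort(expected);
    System.out.println(actual.equals(expected) ? "MATCH" : "DIFF");
  }
}
{code}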
Re: Review Request 21168: HIVE-6999: Add streaming mode to PTFs
On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java, line 58 https://reviews.apache.org/r/21168/diff/1/?file=576144#file576144line58 Can you add a comment why we need to keep track for first row processed in Map? done On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java, line 319 https://reviews.apache.org/r/21168/diff/1/?file=576144#file576144line319 Better name : outputPartRowsItr? done On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote: ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java, line 96 https://reviews.apache.org/r/21168/diff/1/?file=576148#file576148line96 Comment made sense. Since like those fields are not present in class any more. Shall we just get rid of this? this is needed; transient based on BeanInfo (get/set methods) in class On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote: ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java, line 218 https://reviews.apache.org/r/21168/diff/1/?file=576148#file576148line218 Better name: canAcceptInputAsStream? done On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java, line 449 https://reviews.apache.org/r/21168/diff/1/?file=576143#file576143line449 Instead of adding noop functions statically, we should put these functions in test/ package and than do add jar for testing. Multiple reasons: * Better to isolate test code from production code. * It will also exercise add jar functionality for PTF functions for which I am not sure we have coverage. * These functions also show up in default list of inbuilt functions. It may confuse user to wonder what good these functions are for. show_functions.q failed because of this. Agree with your comments on Noop. This was done because for testing we need a PTF and Noop has some special short circuit path for Partition handling. But can we do this as a separate Jira; removing references to Noop in the translation code is non trivial work. - Harish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21168/#review42578 --- On May 14, 2014, 9:21 p.m., Harish Butani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21168/ --- (Updated May 14, 2014, 9:21 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6999 https://issues.apache.org/jira/browse/HIVE-6999 Repository: hive-git Description --- There are a set of use cases where the Table Function can operate on a Partition row by row or on a subset(window) of rows as it is being streamed to it. Windowing has couple of use cases of this:processing of Rank functions, processing of Window Aggregations. But this is a generic concept: any analysis that operates on an Ordered partition maybe able to operate in Streaming mode. This patch introduces streaming mode in PTFs and provides the mechanics to handle PTF chains that contain both modes of PTFs. Subsequent patches will introduce Streaming mode for Windowing. 
Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 3bb8fa9 ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java 4d314b7 ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java 34aebf0 ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopStreaming.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopWithMapStreaming.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 1087bbf ql/src/test/queries/clientpositive/ptf_streaming.q PRE-CREATION ql/src/test/results/clientpositive/ptf_streaming.q.out PRE-CREATION Diff: https://reviews.apache.org/r/21168/diff/ Testing --- added new tests Thanks, Harish Butani
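For readers unfamiliar with the distinction discussed above, here is a hedged conceptual sketch of buffered versus streaming PTF evaluation. The interfaces are hypothetical and are not the actual PTFOperator/TableFunctionEvaluator API.
{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch only: a buffered PTF must see the whole partition before
// producing output, while a streaming PTF can emit output row by row as the
// partition is streamed to it, keeping memory usage bounded.
class BufferedPtf<R> {
    private final List<R> partition = new ArrayList<>();
    void addRow(R row) { partition.add(row); }               // buffer every row
    Iterator<R> evaluate() { return partition.iterator(); }  // usable only once the partition is complete
}

interface StreamingPtf<R> {
    R processRow(R row);       // may emit an output row immediately (or null)
    List<R> finishPartition(); // flush any rows that had to wait for partition end
}
{code}
A PTF chain can then mix both kinds, with the planner inserting buffering only where a function cannot stream.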
[jira] [Updated] (HIVE-6601) alter database commands should support schema synonym keyword
[ https://issues.apache.org/jira/browse/HIVE-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abdelrahman Shettia updated HIVE-6601: -- Assignee: Abdelrahman Shettia alter database commands should support schema synonym keyword - Key: HIVE-6601 URL: https://issues.apache.org/jira/browse/HIVE-6601 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Abdelrahman Shettia It should be possible to use alter schema as an alternative to alter database. But the syntax is not currently supported. {code} alter schema db1 set owner user x; NoViableAltException(215@[]) FAILED: ParseException line 1:6 cannot recognize input near 'schema' 'db1' 'set' in alter statement {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7040) TCP KeepAlive for HiveServer2
Nicolas Thiébaud created HIVE-7040: -- Summary: TCP KeepAlive for HiveServer2 Key: HIVE-7040 URL: https://issues.apache.org/jira/browse/HIVE-7040 Project: Hive Issue Type: Improvement Components: Server Infrastructure Reporter: Nicolas Thiébaud Attachments: HIVE-7040.patch Implement TCP KeepAlive for HiveServer2 to avoid half-open connections. A setting could be added:
{code}
<property>
  <name>hive.server2.tcp.keepalive</name>
  <value>true</value>
  <description>Whether to enable TCP keepalive for Hive Server 2</description>
</property>
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
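As an illustration only (this is not the HiveServer2 patch), enabling keepalive on a plain Java server socket looks roughly like the sketch below; with SO_KEEPALIVE set, the OS probes idle peers and eventually closes half-open connections. The port number and configuration wiring are assumptions.
{code:java}
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveServerExample {
    public static void main(String[] args) throws IOException {
        boolean keepAlive = true; // would be read from hive.server2.tcp.keepalive
        try (ServerSocket server = new ServerSocket(10000)) {
            while (true) {
                Socket client = server.accept();
                client.setKeepAlive(keepAlive); // enable SO_KEEPALIVE on the accepted socket
                // ... hand the socket off to the transport/worker thread
            }
        }
    }
}
{code}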
[jira] [Created] (HIVE-7041) DoubleWritable/ByteWritable should extend their hadoop counterparts
Jason Dere created HIVE-7041: Summary: DoubleWritable/ByteWritable should extend their hadoop counterparts Key: HIVE-7041 URL: https://issues.apache.org/jira/browse/HIVE-7041 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-7041.1.patch Hive has its own implementations of ByteWritable/DoubleWritable/ShortWritable. We cannot replace usage of these classes since that would break 3rd party UDFs/SerDes; however, we can at least extend from the Hadoop versions of these classes when possible to avoid duplicate code. When Hive finally moves to version 1.0, we might want to consider removing these Hive-specific writables and switching over to the Hadoop versions of these classes. ShortWritable didn't exist in Hadoop until 2.x, so it looks like we can't do it for that class until 0.20/1.x support is dropped from Hive. -- This message was sent by Atlassian JIRA (v6.2#6252)
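A hedged sketch of the approach described above: keep the Hive-named class so existing 3rd-party UDFs/SerDes keep compiling, but inherit the implementation from the Hadoop writable. The class name below is illustrative; the real Hive classes live under org.apache.hadoop.hive.serde2.io.
{code:java}
import org.apache.hadoop.io.DoubleWritable;

// Illustrative only: the Hive-named class keeps its identity for 3rd-party
// code, while write()/readFields()/compareTo() come from the Hadoop parent,
// removing the duplicated implementation.
public class HiveDoubleWritable extends DoubleWritable {
    public HiveDoubleWritable() {
        super();
    }

    public HiveDoubleWritable(double value) {
        super(value);
    }
}
{code}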
[jira] [Commented] (HIVE-6692) Location for new table or partition should be a write entity
[ https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995664#comment-13995664 ] Thejas M Nair commented on HIVE-6692: - [~navis] FYI, I will be working on making changes for SQL std auth to work with this soon (in a week or two). And then we can make the change in this jira without breaking it. Location for new table or partition should be a write entity Key: HIVE-6692 URL: https://issues.apache.org/jira/browse/HIVE-6692 Project: Hive Issue Type: Task Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6692.1.patch.txt Locations for create table and alter table add partition should be write entities. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Apache Hive 0.13.1
Hi Folks, One more final hiccup before actually releasing the RC: I currently do not seem to be part of the hive unix group on apache, which prevents me from publishing maven artifacts or adding myself to the KEYS list. So as not to block the release process, however, I've generated the tarballs and signatures for Apache Hive 0.13.1 Release Candidate 0 here: https://people.apache.org/~khorgath/releases/0.13.1_RC0/artifacts/ Maven artifacts are not yet available, so you'll need to generate them locally from the source tarball above. Source tag for RC0 is at https://svn.apache.org/repos/asf/hive/tags/release-0.13.1-rc0/ I also put up my public key over at https://people.apache.org/~khorgath/releases/0.13.1_RC0/artifacts/khorgath.public_key in the meantime for verification purposes. Voting has not yet begun because the maven artifacts have not been published and the KEYS file has not been updated, so I have not formally sent an official voting mail yet - I will do so as soon as I'm able. Please consider this an early preview for testing; I do not expect to change these files for RC0 itself. Thanks! -Sushanth On Thu, May 8, 2014 at 2:26 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi folks, As an update, HIVE-6945 has some 0.13.1 specific test fixes appended which make its tests pass, the test that was failing with HIVE-6826 is now succeeding (flaky test), and Thejas has confirmed with me that the issue with HIVE-6846 is a test problem, not a product problem, relating to an incorrect expectation in the test. With those resolved, there are no more blockers, and no additional jiras that have been requested to be part of this release, so I'll go ahead and spin out RC0 now, and will also commit all those patches to the 0.13 branch. :) On Wed, May 7, 2014 at 7:22 PM, Sushanth Sowmyan khorg...@gmail.com wrote: After much experimentation with git bisect (which is very powerful), I've narrowed down the test failures reported yesterday. The failures are appearing from the following: HIVE-6945: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property HIVE-6846: org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs HIVE-6826: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6 Of the above, the second jira was already in 0.13.0. I'll comment on those jiras, asking the committers involved in those bugs to help debug the issue. If anyone is interested in the git bisect logs for these, they're available at http://people.apache.org/~khorgath/releases/0.13.1_RC0/test_failures/ On Tue, May 6, 2014 at 6:41 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Also, I wanted to throw in one more bit for those of you that are interested in tinkering along: http://people.apache.org/~khorgath/releases/0.13.1_RC0/relprep.pl http://people.apache.org/~khorgath/releases/0.13.1_RC0/requested_jiras This is the script and config file I'm using to generate this release. 
It's very much a hack right now, and I hope to improve it to streamline releases in the future, but it can be used right now as follows: a) Put it in a hive git repo (with no uncommitted changes - this script will check out a new branch and commit to that branch, so make sure the repo is clean). b) Put the file requested_jiras in that dir as well. c) Run the script from there. It checks the differences against the branch being released (branch-0.13 is currently hardcoded as a global), looks at all the commit logs in trunk that correspond to the jiras requested in the requested_jiras file, sorts them in the order they were committed, then checks out a new branch called relprep-branch-0.13-timestamp and attempts to cherry-pick those commits in. For some patches, this will not work, so there is an override mechanism provided by entries in the requested_jiras file, as can be observed in the file I mention above. At the end of it, you'll have your 0.13.1 repo reproduction to test against if you so desire. Known Bugs: a) I use system() or die ...;, which is faulty in that the die code will never be reached. I need to fix this, but all the system calls were working for me, and I'd much rather focus on the release now and improve this script later. This is a TODO. b) Some patches (those generated with --no-prefix) don't work with older versions of git. You'll need a 1.8.x git for them, or you have to generate git patches without --no-prefix. On Tue, May 6, 2014 at 6:21 PM, Sushanth Sowmyan khorg...@gmail.com
[jira] [Updated] (HIVE-6999) Add streaming mode to PTFs
[ https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6999: Status: Open (was: Patch Available) Add streaming mode to PTFs -- Key: HIVE-6999 URL: https://issues.apache.org/jira/browse/HIVE-6999 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0, 0.12.0, 0.11.0 Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch, HIVE-6999.3.patch There are a set of use cases where the Table Function can operate on a Partition row by row or on a subset (window) of rows as it is being streamed to it. - Windowing has a couple of use cases of this: processing of Rank functions and processing of Window Aggregations. - But this is a generic concept: any analysis that operates on an ordered partition may be able to operate in Streaming mode. This patch introduces streaming mode in PTFs and provides the mechanics to handle PTF chains that contain both modes of PTFs. Subsequent patches will introduce Streaming mode for Windowing. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 21168: HIVE-6999: Add streaming mode to PTFs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21168/ --- (Updated May 14, 2014, 9:21 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6999 https://issues.apache.org/jira/browse/HIVE-6999 Repository: hive-git Description --- There are a set of use cases where the Table Function can operate on a Partition row by row or on a subset (window) of rows as it is being streamed to it. Windowing has a couple of use cases of this: processing of Rank functions and processing of Window Aggregations. But this is a generic concept: any analysis that operates on an ordered partition may be able to operate in Streaming mode. This patch introduces streaming mode in PTFs and provides the mechanics to handle PTF chains that contain both modes of PTFs. Subsequent patches will introduce Streaming mode for Windowing. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 3bb8fa9 ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java 4d314b7 ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java 34aebf0 ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopStreaming.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopWithMapStreaming.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 1087bbf ql/src/test/queries/clientpositive/ptf_streaming.q PRE-CREATION ql/src/test/results/clientpositive/ptf_streaming.q.out PRE-CREATION Diff: https://reviews.apache.org/r/21168/diff/ Testing --- added new tests Thanks, Harish Butani
[jira] [Commented] (HIVE-6999) Add streaming mode to PTFs
[ https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998255#comment-13998255 ] Ashutosh Chauhan commented on HIVE-6999: +1 Add streaming mode to PTFs -- Key: HIVE-6999 URL: https://issues.apache.org/jira/browse/HIVE-6999 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch There are a set of use cases where the Table Function can operate on a Partition row by row or on a subset (window) of rows as it is being streamed to it. - Windowing has a couple of use cases of this: processing of Rank functions and processing of Window Aggregations. - But this is a generic concept: any analysis that operates on an ordered partition may be able to operate in Streaming mode. This patch introduces streaming mode in PTFs and provides the mechanics to handle PTF chains that contain both modes of PTFs. Subsequent patches will introduce Streaming mode for Windowing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5538) Turn on vectorization by default.
[ https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5538: --- Status: Patch Available (was: Open) Turn on vectorization by default. - Key: HIVE-5538 URL: https://issues.apache.org/jira/browse/HIVE-5538 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch, HIVE-5538.4.patch, HIVE-5538.5.patch Vectorization should be turned on by default, so that users don't have to specifically enable vectorization. Vectorization code validates and ensures that a query falls back to row mode if it is not supported on the vectorized code path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-7066 started by David Chen. hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 
13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148) at
[jira] [Updated] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-7066: - Attachment: HIVE-7066.1.patch hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Attachments: HIVE-7066.1.patch Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476) at
[jira] [Commented] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998278#comment-13998278 ] David Chen commented on HIVE-7066: -- RB: https://reviews.apache.org/r/21471 hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Attachments: HIVE-7066.1.patch Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at
[jira] [Updated] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results
[ https://issues.apache.org/jira/browse/HIVE-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7067: - Attachment: HIVE-7067.1.patch Min() and Max() on Timestamp and Date columns for ORC returns wrong results --- Key: HIVE-7067 URL: https://issues.apache.org/jira/browse/HIVE-7067 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7067.1.patch min() and max() of timestamp and date columns of an ORC table return wrong results. The reason is that when ORC creates object inspectors for date and timestamp, it uses Java primitive objects as opposed to writable objects. When get() is performed on a Java primitive object, a reference to the underlying object is returned, whereas when get() is performed on a writable object, a copy of the underlying object is returned. The fix is to change the object inspector creation to return writable objects for timestamp and date. -- This message was sent by Atlassian JIRA (v6.2#6252)
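A small illustration of why returning a reference breaks min/max (hypothetical code, not the ORC reader itself): if the reader reuses one mutable object per column and get() hands back a reference to it, an aggregate that simply stores that reference ends up tracking whatever the object was last mutated to, not the true minimum.
{code:java}
import java.sql.Timestamp;

public class ReferenceVsCopyExample {
    public static void main(String[] args) {
        Timestamp reused = new Timestamp(0); // one shared object, mutated per row

        Timestamp min = null;
        long[] rows = {5_000L, 1_000L, 9_000L};
        for (long v : rows) {
            reused.setTime(v);               // reader overwrites the shared object
            if (min == null || reused.before(min)) {
                min = reused;                // BUG: stores a reference, not a copy
            }
        }
        // Prints 9000 (the last row read) instead of the true minimum 1000;
        // handing out a fresh writable copy per row avoids this.
        System.out.println("broken min = " + min.getTime());
    }
}
{code}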
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6207: Fix Version/s: (was: 0.13.0) 0.14.0 Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.14.0 HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
On May 9, 2014, 1:58 a.m., Gunther Hagleitner wrote: serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java, line 147 https://reviews.apache.org/r/18936/diff/13/?file=572150#file572150line147 randomaccess doesn't extend output? no - Sergey --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/#review42555 --- On May 1, 2014, 2:29 a.m., Sergey Shelukhin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated May 1, 2014, 2:29 a.m.) Review request for hive, Gopal V and Gunther Hagleitner. Repository: hive-git Description --- See JIRA Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 conf/hive-default.xml.template 2552560 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 142bfd8 ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java f5d4670 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 64f0be2 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java d4be78d ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 3077d75 ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java f7b499b ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 093da55 ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 ql/src/test/queries/clientpositive/tez_union.q f80d94c ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 8350670 ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java 9079b9d serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java 1b09d41 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 5870884 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java bab505e serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 6f344bb serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java a99c7b4 serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 435d6c6 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 82c1263 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java b188c3f serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java caf3517
[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
[ https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6862: --- Fix Version/s: 0.13.1 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server -- Key: HIVE-6862 URL: https://issues.apache.org/jira/browse/HIVE-6862 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0, 0.13.1 Attachments: HIVE-6862.2.patch, HIVE-6862.3.patch, HIVE-6862.patch need to add a unified 0.13 script and a separate script for ACID support NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6919) hive sql std auth select query fails on partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6919: --- Fix Version/s: 0.13.1 hive sql std auth select query fails on partitioned tables -- Key: HIVE-6919 URL: https://issues.apache.org/jira/browse/HIVE-6919 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0, 0.13.1 Attachments: HIVE-6919.1.patch {code} analyze table studentparttab30k partition (ds) compute statistics; Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied. Principal [name=hadoopqa, type=USER] does not have following privileges on Object [type=PARTITION, name=null] : [SELECT] (state=42000,code=4) {code} Sql std auth is supposed to ignore partition level objects for privilege checks, but that is not working as intended. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6893) out of sequence error in HiveMetastore server
[ https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6893: Fix Version/s: (was: 0.13.0) 0.14.0 out of sequence error in HiveMetastore server - Key: HIVE-6893 URL: https://issues.apache.org/jira/browse/HIVE-6893 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Romain Rigaux Assignee: Naveen Gangam Fix For: 0.14.0 Attachments: HIVE-6893.1.patch Calls listing databases or tables fail. It seems to be a concurrency problem. {code} 014-03-06 05:34:00,785 ERROR hive.log: org.apache.thrift.TApplicationException: get_databases failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648) at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66) at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278) at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57) at com.sun.proxy.$Proxy9.getSchemas(Unknown Source) at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192) at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433) at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
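One way to reason about the "out of sequence response" error (an illustrative sketch with a hypothetical client interface; this is not the HIVE-6893 patch): a Thrift client can only have one request/response pair in flight, so concurrent callers must either serialize access to a shared client or use one client per thread.
{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class SynchronizedMetastoreClient {
    /** Hypothetical stand-in for a Thrift-backed metastore client. */
    public interface MetastoreClient {
        List<String> getDatabases(String pattern) throws Exception;
    }

    private final MetastoreClient delegate;
    private final ReentrantLock lock = new ReentrantLock();

    public SynchronizedMetastoreClient(MetastoreClient delegate) {
        this.delegate = delegate;
    }

    public List<String> getDatabases(String pattern) throws Exception {
        lock.lock();
        try {
            // Only one request/response pair in flight at a time, so Thrift
            // sequence numbers cannot interleave across callers.
            return delegate.getDatabases(pattern);
        } finally {
            lock.unlock();
        }
    }
}
{code}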
[jira] [Updated] (HIVE-7037) Add additional tests for transform clauses with Tez
[ https://issues.apache.org/jira/browse/HIVE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-7037: - Status: Patch Available (was: Open) Add additional tests for transform clauses with Tez --- Key: HIVE-7037 URL: https://issues.apache.org/jira/browse/HIVE-7037 Project: Hive Issue Type: Bug Components: Tez Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-7037.1.patch Enabling some q tests for Tez wrt ScriptOperator/Stream/Transform. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19984: Beeline should accept -i option to Initializing a SQL file
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19984/#review42621 --- Can you also include a unit test for this? It can go into TestBeeLineWithArgs.java - Thejas Nair On May 7, 2014, 4:10 a.m., Navis Ryu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19984/ --- (Updated May 7, 2014, 4:10 a.m.) Review request for hive. Bugs: HIVE-6561 https://issues.apache.org/jira/browse/HIVE-6561 Repository: hive-git Description --- Hive CLI has -i option. From Hive CLI help: {code} ... -i <filename>   Initialization SQL file ... {code} However, Beeline has no such option: {code} xzhang@xzlt:~/apa/hive3$ ./packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/bin/beeline -u jdbc:hive2:// -i hive.rc ... Connected to: Apache Hive (version 0.14.0-SNAPSHOT) Driver: Hive JDBC (version 0.14.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ -i (No such file or directory) Property url is required Beeline version 0.14.0-SNAPSHOT by Apache Hive ... {code} Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 5773109 beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 44cabdf beeline/src/java/org/apache/hive/beeline/Commands.java 493f963 beeline/src/main/resources/BeeLine.properties 697c29a Diff: https://reviews.apache.org/r/19984/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-7000) Several issues with javadoc generation
[ https://issues.apache.org/jira/browse/HIVE-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7000: --- Status: Patch Available (was: Open) Several issues with javadoc generation -- Key: HIVE-7000 URL: https://issues.apache.org/jira/browse/HIVE-7000 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7000.1.patch 1. Ran 'mvn javadoc:javadoc -Phadoop-2'. Encountered several issues: generated classes are included in the javadoc, and generation fails in the top-level hcatalog folder because its src folder contains no java files. Patch attached to fix these issues. 2. Tried mvn javadoc:aggregate -Phadoop-2 - cannot get an aggregated javadoc for all of hive; tried setting the 'aggregate' parameter to true, but that didn't work. There are several questions on StackOverflow about multi-project javadoc. Seems like this is broken. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5342) Remove pre hadoop-0.20.0 related codes
[ https://issues.apache.org/jira/browse/HIVE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5342: - Status: Patch Available (was: Open) Remove pre hadoop-0.20.0 related codes -- Key: HIVE-5342 URL: https://issues.apache.org/jira/browse/HIVE-5342 Project: Hive Issue Type: Task Reporter: Navis Assignee: Jason Dere Priority: Trivial Attachments: D13047.1.patch, HIVE-5342.1.patch, HIVE-5342.2.patch Recently, we discussed not supporting hadoop-0.20.0. Whether or not that happens, the 0.17-related code can be removed first. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable
[ https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997714#comment-13997714 ] Xuefu Zhang commented on HIVE-7049: --- [~kamrul] If Hive can support the AVRO schema resolutions you mentioned, I don't see any obstacles. However, the fix in your patch seems to have a problem with decimal, which may need more deliberation. Unable to deserialize AVRO data when file schema and record schema are different and nullable - Key: HIVE-7049 URL: https://issues.apache.org/jira/browse/HIVE-7049 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-7049.1.patch It mainly happens when 1) the file schema and record schema are not the same and 2) the record schema is nullable but the file schema is not. The potential code location is in class AvroDeserializer {noformat} if(AvroSerdeUtils.isNullableType(recordSchema)) { return deserializeNullableUnion(datum, fileSchema, recordSchema, columnType); } {noformat} In the above code snippet, recordSchema is checked for being nullable, but the file schema is not checked. I tested with these values: {noformat} recordSchema= [null,string] fileSchema= string {noformat} And I got the following exception (line numbers might not be the same due to my debugged code version). {noformat} org.apache.avro.AvroRuntimeException: Not a union: string at org.apache.avro.Schema.getTypes(Schema.java:272) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
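The failure mode can be illustrated with a small guard over the Avro Schema API (a sketch under the assumption that both schemas are available at this point; it is not the actual HIVE-7049 patch):
{code:java}
import org.apache.avro.Schema;

public class NullableUnionCheckExample {
    // Illustrative sketch of the check described above (not the actual patch).
    static boolean isNullableUnion(Schema schema) {
        return schema.getType() == Schema.Type.UNION
                && schema.getTypes().stream().anyMatch(s -> s.getType() == Schema.Type.NULL);
    }

    static Schema branchToRead(Schema fileSchema, Schema recordSchema) {
        // The record schema may be ["null","string"] while the file schema is a
        // plain "string"; calling getTypes() on the non-union file schema throws
        // "Not a union", so guard both sides before union-specific handling.
        if (isNullableUnion(recordSchema) && fileSchema.getType() != Schema.Type.UNION) {
            return fileSchema; // the file already stores the non-null branch directly
        }
        return recordSchema;
    }
}
{code}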
[jira] [Updated] (HIVE-6374) Hive job submitted with non-default name node (fs.default.name) doesn't process locations properly
[ https://issues.apache.org/jira/browse/HIVE-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6374: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Benjamin! Hive job submitted with non-default name node (fs.default.name) doesn't process locations properly --- Key: HIVE-6374 URL: https://issues.apache.org/jira/browse/HIVE-6374 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: Any Reporter: Benjamin Zhitomirsky Assignee: Benjamin Zhitomirsky Fix For: 0.14.0 Attachments: Design of the fix HIVE-6374.docx, hive-6374.1.patch, hive-6374.3.patch, hive-6374.patch Original Estimate: 168h Remaining Estimate: 168h Create table/index/database and add partition DDL doesn't work properly if all of the following conditions are true: - Metastore service is used - fs.default.name is specified and it differs from the default one - Location is not specified or is specified as a non-fully-qualified URI The root cause of this behavior is that the Hive client doesn't pass the configuration context to the metastore service, which tries to resolve the paths. The fix is to resolve the path in the Hive client if fs.default.name is specified and differs from the default one (this is much easier than passing the context through, which would be a major change). The CR will be submitted shortly after tests are done. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-6938: --- Attachment: HIVE-6938.2.patch Reuploading the exact same patch to trigger precommits. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
[ https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7012: --- Assignee: Navis Status: Open (was: Patch Available) reduce_deduplicate_extended.q, ppd.q, fetch_aggregation.q failures might be relevant. [~navis] can you take a look? Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer Key: HIVE-7012 URL: https://issues.apache.org/jira/browse/HIVE-7012 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Sun Rui Assignee: Navis Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt With HIVE 0.13.0, run the following test case: {code:sql} create table src(key bigint, value string); select count(distinct key) as col0 from src order by col0; {code} The following exception will be thrown: {noformat} java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88) ... 9 more Caused by: java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173) ... 14 more Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:reducesinkkey0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79) at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376) at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166) ... 14 more {noformat} This issue is related to HIVE-6455. When hive.optimize.reducededuplication is set to false, then this issue will be gone. 
Logical plan when hive.optimize.reducededuplication=false; {noformat} src TableScan (TS_0) alias: src Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator (SEL_1) expressions: key (type: bigint) outputColumnNames: key Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Group By Operator (GBY_2) aggregations: count(DISTINCT key) keys: key (type: bigint) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Reduce Output Operator (RS_3) istinctColumnIndices: key expressions: _col0 (type: bigint) DistributionKeys: 0 sort order: + OutputKeyColumnNames: _col0 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Group By Operator (GBY_4) aggregations: count(DISTINCT KEY._col0:0._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE Select Operator (SEL_5) expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator (RS_6) key expressions: _col0
[jira] [Created] (HIVE-7063) Optimize for the Top N within a Group use case
Harish Butani created HIVE-7063: --- Summary: Optimize for the Top N within a Group use case Key: HIVE-7063 URL: https://issues.apache.org/jira/browse/HIVE-7063 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani It is common to rank within a Group/Partition and then only return the Top N entries within each Group. With Streaming mode for Windowing, we should push the post filter on the rank into the Windowing processing as a Limit expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6976) Show query id only when there's jobs on the cluster
[ https://issues.apache.org/jira/browse/HIVE-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6976: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the review Sergey! Show query id only when there's jobs on the cluster --- Key: HIVE-6976 URL: https://issues.apache.org/jira/browse/HIVE-6976 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6976.1.patch No need to print the query id for local-only execution. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995623#comment-13995623 ] Hive QA commented on HIVE-6187: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12644392/HIVE-6187.1.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5504 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/178/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/178/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12644392 Add test to verify that DESCRIBE TABLE works with quoted table names Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5810) create a function add_date as exists in mysql
[ https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-5810: - Status: Open (was: Patch Available) create a function add_date as exists in mysql Key: HIVE-5810 URL: https://issues.apache.org/jira/browse/HIVE-5810 Project: Hive Issue Type: Improvement Reporter: Anandha L Ranganathan Assignee: Anandha L Ranganathan Attachments: HIVE-5810.2.patch, HIVE-5810.patch Original Estimate: 40h Remaining Estimate: 40h MySQL has ADDDATE(date,INTERVAL expr unit). Similarly in Hive we can have add_date(date, unit, expr), where unit is DAY/MONTH/YEAR. For example, add_date('2013-11-09','DAY',2) will return 2013-11-11, add_date('2013-11-09','Month',2) will return 2014-01-09, and add_date('2013-11-09','Year',2) will return 2015-11-09. -- This message was sent by Atlassian JIRA (v6.2#6252)
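A hedged sketch of the proposed semantics using java.time (a hypothetical helper for illustration, not the Hive UDF itself):
{code:java}
import java.time.LocalDate;

public class AddDateExample {
    static LocalDate addDate(String date, String unit, int amount) {
        LocalDate d = LocalDate.parse(date);
        switch (unit.toUpperCase()) {
            case "DAY":   return d.plusDays(amount);
            case "MONTH": return d.plusMonths(amount);
            case "YEAR":  return d.plusYears(amount);
            default: throw new IllegalArgumentException("Unknown unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(addDate("2013-11-09", "DAY", 2));   // 2013-11-11
        System.out.println(addDate("2013-11-09", "Month", 2)); // 2014-01-09
        System.out.println(addDate("2013-11-09", "Year", 2));  // 2015-11-09
    }
}
{code}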
[jira] [Commented] (HIVE-7061) sql std auth - insert queries without overwrite should not require delete privileges
[ https://issues.apache.org/jira/browse/HIVE-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998147#comment-13998147 ] Thejas M Nair commented on HIVE-7061: - WriteEntity types are already in hive, as part of HIVE-5843. sql std auth - insert queries without overwrite should not require delete privileges Key: HIVE-7061 URL: https://issues.apache.org/jira/browse/HIVE-7061 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Insert queries can do the equivalent of delete and insert of all rows of a table or partition, if the overwrite keyword is used. As a result DELETE privilege is applicable to such queries. However, SQL Standard auth requires DELETE privilege even for queries that don't have the overwrite keyword. -- This message was sent by Atlassian JIRA (v6.2#6252)
Hive and MR2
Hi, I am using hive-0.13.0 and hadoop-2.4.0. Why must I set 'mapreduce.jobtracker.address' in yarn-site.xml? Without it there are exceptions and the job fails, even though 'mapreduce.jobtracker.address' can be set to any value. The following messages are generated when 'mapreduce.jobtracker.address' is not set. Job output on the console: Execution log at: /tmp/test/test_20140507180505_bcd4d89f-017c-4cf4-81a3-5fa619de0ad0.log Job running in-process (local Hadoop) Hadoop job information for null: number of mappers: 1; number of reducers: 1 2014-05-07 18:06:25,782 null map = 0%, reduce = 0% 2014-05-07 18:06:33,699 null map = 100%, reduce = 0% 2014-05-07 18:06:34,774 null map = 0%, reduce = 0% 2014-05-07 18:06:49,222 null map = 100%, reduce = 100% Ended Job = job_1399453944131_0006 with errors Error during job, obtaining debugging information... Container error: 2014-05-07 18:06:33,634 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: No plan file found: file:/tmp/test/hive_2014-05-07_18-06-08_349_1526907284076641211-1/-mr-10001/0a1c9ebe-cdb0-4adc-9e93-8f176019f19a/map.xml 2014-05-07 18:06:33,635 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
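For reference, the same property can in principle be supplied per session from the Hive CLI rather than in yarn-site.xml; whether that has the same effect as the cluster-side setting is an assumption here and has not been verified:
{code:sql}
-- Untested workaround sketch; per the report above, the actual value does not seem to matter.
set mapreduce.jobtracker.address=ignored;
{code}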
[jira] [Created] (HIVE-7047) Support schema keyword in alter database statements
Thejas M Nair created HIVE-7047: --- Summary: Support schema keyword in alter database statements Key: HIVE-7047 URL: https://issues.apache.org/jira/browse/HIVE-7047 Project: Hive Issue Type: Bug Components: Database/Schema, SQL Affects Versions: 0.13.0 Reporter: Thejas M Nair To be consistent with the rest of the syntax, the ALTER DATABASE statements should also support the SCHEMA keyword along with the DATABASE keyword. -- This message was sent by Atlassian JIRA (v6.2#6252)
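A sketch of the intended equivalence (database name and property are illustrative; the first form already works, the second is what this issue would add):
{code:sql}
ALTER DATABASE db1 SET DBPROPERTIES ('edited.by'='thejas');
ALTER SCHEMA db1 SET DBPROPERTIES ('edited.by'='thejas');
{code}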
[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated HIVE-7054: - Status: Patch Available (was: Open) Thanks [~rusanu] for the quick review on the review board! The attached patch attempts to incorporate the feedback, along with a minor update to the qtest results output file. Support ELT UDF in vectorized mode -- Key: HIVE-7054 URL: https://issues.apache.org/jira/browse/HIVE-7054 Project: Hive Issue Type: New Feature Components: Vectorization Affects Versions: 0.14.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7054.2.patch, HIVE-7054.patch Implement support for the ELT UDF in vectorized execution mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
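As a usage reminder, elt(n, str1, str2, ...) returns its n-th string argument; a query of the following shape (table and column names illustrative) is the kind that should now run in vectorized mode when vectorization is enabled:
{code:sql}
SET hive.vectorized.execution.enabled=true;
-- elt(2, ...) picks the second string argument, here last_name.
SELECT elt(2, first_name, last_name) FROM people_orc;
{code}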
[jira] [Work stopped] (HIVE-7066) hive-exec jar is missing avro-mapred
[ https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-7066 stopped by David Chen. hive-exec jar is missing avro-mapred Key: HIVE-7066 URL: https://issues.apache.org/jira/browse/HIVE-7066 Project: Hive Issue Type: Bug Reporter: David Chen Assignee: David Chen Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side: {code} java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365) at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276) at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445) at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat Serialization trace: outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc) aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850) at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864) at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334) ... 
13 more Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45) at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26) at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56) at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148) at
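The kind of query reported to fail is simply a read of an Avro-backed table, as in the sketch below (table name illustrative; the STORED AS AVRO shorthand is assumed, older setups spell out the Avro serde and input/output formats). The failure then surfaces cluster-side while Kryo deserializes the MapWork, as shown in the trace above:
{code:sql}
CREATE TABLE avro_events (id INT, name STRING) STORED AS AVRO;
SELECT * FROM avro_events LIMIT 10;
{code}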
[jira] [Created] (HIVE-7068) Integrate AccumuloStorageHandler
Josh Elser created HIVE-7068: Summary: Integrate AccumuloStorageHandler Key: HIVE-7068 URL: https://issues.apache.org/jira/browse/HIVE-7068 Project: Hive Issue Type: New Feature Reporter: Josh Elser [Accumulo|http://accumulo.apache.org] is a BigTable clone similar to HBase. Some [initial work|https://github.com/bfemiano/accumulo-hive-storage-manager] has already been done to support querying an Accumulo table using Hive. It is not a complete solution; most notably, the current implementation lacks support for INSERTs. I would like to polish up the AccumuloStorageHandler (presently based on 0.10), implement the missing basic functionality, and compare it to the HBaseStorageHandler (to ensure that we follow the same general usage patterns). I've also been in communication with [~bfem] (the initial author), who expressed interest in working on this again. I hope to coordinate efforts with him. -- This message was sent by Atlassian JIRA (v6.2#6252)
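Since the stated goal is to follow the HBaseStorageHandler's general usage patterns, the Accumulo DDL would presumably mirror the existing HBase form sketched below; the eventual Accumulo handler class and property names are not settled in this issue, so only the HBase analogue is shown:
{code:sql}
-- Existing HBase pattern the Accumulo handler would be expected to mirror:
CREATE TABLE hbase_t (key INT, value STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val");
{code}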
[jira] [Updated] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
[ https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7050: - Attachment: HIVE-7050.3.patch Addressed [~xuefuz]'s review comments and left a reply in RB. RB is flaky right now; I will update the patch in RB later. Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE - Key: HIVE-7050 URL: https://issues.apache.org/jira/browse/HIVE-7050 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch, HIVE-7050.3.patch There is currently no way to display the column-level stats from the Hive CLI. It would be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE. -- This message was sent by Atlassian JIRA (v6.2#6252)
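In other words, the table-level forms sketched below (table name illustrative) would additionally report column statistics in their output:
{code:sql}
DESCRIBE FORMATTED web_sales;
DESCRIBE EXTENDED web_sales;
{code}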
[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable
[ https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998371#comment-13998371 ] Mohammad Kamrul Islam commented on HIVE-7049: - Thanks @xzhang. However, the fix in your patch seems to have a problem with decimal, which may need more deliberation. What is the (potential) problem with decimal? Any proposal for how to address the decimal problem? Unable to deserialize AVRO data when file schema and record schema are different and nullable - Key: HIVE-7049 URL: https://issues.apache.org/jira/browse/HIVE-7049 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-7049.1.patch It mainly happens when 1) the file schema and record schema are not the same, and 2) the record schema is nullable but the file schema is not. The potential code location is in class AvroDeserializer: {noformat} if(AvroSerdeUtils.isNullableType(recordSchema)) { return deserializeNullableUnion(datum, fileSchema, recordSchema, columnType); } {noformat} In the above code snippet, recordSchema is checked for being nullable, but the file schema is not checked. I tested with these values: {noformat} recordSchema= [null,string] fileSchema= string {noformat} And I got the following exception (line numbers might not be the same due to my debugged code version): {noformat} org.apache.avro.AvroRuntimeException: Not a union: string at org.apache.avro.Schema.getTypes(Schema.java:272) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7063) Optimize for the Top N within a Group use case
[ https://issues.apache.org/jira/browse/HIVE-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998389#comment-13998389 ] Gopal V commented on HIVE-7063: --- This would be exceptionally useful - I have seen at least two implementations of TOPN UDAFs for this. Optimize for the Top N within a Group use case -- Key: HIVE-7063 URL: https://issues.apache.org/jira/browse/HIVE-7063 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani It is common to rank within a Group/Partition and then only return the Top N entries within each Group. With Streaming mode for Windowing, we should push the post filter on the rank into the Windowing processing as a Limit expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
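The query shape being targeted looks like the following (table and column names illustrative); the proposal is to push the outer filter on the rank into the windowing stage as a limit instead of emitting every ranked row from each group:
{code:sql}
SELECT deptno, empno, salary, r
FROM (
  SELECT deptno, empno, salary,
         rank() OVER (PARTITION BY deptno ORDER BY salary DESC) AS r
  FROM emp
) ranked
WHERE r <= 3;
{code}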
[jira] [Updated] (HIVE-6994) parquet-hive createArray strips null elements
[ https://issues.apache.org/jira/browse/HIVE-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Justin Coffey updated HIVE-6994: Attachment: HIVE-6994.3.patch The failed tests are unrelated to the patch; submitting a patch rebased against trunk and retested. [~szehon], new RB link here: https://reviews.apache.org/r/21430/ Hope we're good :) parquet-hive createArray strips null elements - Key: HIVE-6994 URL: https://issues.apache.org/jira/browse/HIVE-6994 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Justin Coffey Assignee: Justin Coffey Fix For: 0.14.0 Attachments: HIVE-6994-1.patch, HIVE-6994.2.patch, HIVE-6994.3.patch, HIVE-6994.patch The createArray method in ParquetHiveSerDe strips null values from the resultant ArrayWritables. Tracked here as well: https://github.com/Parquet/parquet-mr/issues/377 -- This message was sent by Atlassian JIRA (v6.2#6252)
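A minimal reproduction sketch of the reported symptom (table and source names illustrative; the result comment restates the description and has not been re-verified here):
{code:sql}
CREATE TABLE parquet_arr (vals ARRAY<INT>) STORED AS PARQUET;
INSERT INTO TABLE parquet_arr SELECT array(1, CAST(NULL AS INT), 3) FROM src LIMIT 1;
SELECT vals FROM parquet_arr;   -- reportedly comes back as [1,3] rather than [1,null,3]
{code}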