[jira] [Created] (HIVE-11591) change thrift generation to use undated annotations
Sergey Shelukhin created HIVE-11591: --- Summary: change thrift generation to use undated annotations Key: HIVE-11591 URL: https://issues.apache.org/jira/browse/HIVE-11591 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Thrift has added class annotations to generated classes; these contain the generation date. Because of this, all the Java thrift files change on every re-gen, even if you only make a small change that should not affect a bazillion files. This depends on upgrading to Thrift 0.9.3, which doesn't exist yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 35792: HIVE-10438 - Architecture for ResultSet Compression via external plugin
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35792/ --- (Updated Aug. 17, 2015, 10:37 p.m.) Review request for hive, Vaibhav Gumashta, Xiaojian Wang, Xiao Meng, and Xuefu Zhang. Changes --- Reverted two .gitignore files. Repository: hive-git Description --- This patch enables ResultSet compression for Hive using external plugins. The patch proposes a plugin architecture that enables using external plugins to compress ResultSets on-the-fly. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 730f5be jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java bb2b695 service/if/TCLIService.thrift baf583f service/src/gen/thrift/gen-cpp/TCLIService.h 29a9f4a service/src/gen/thrift/gen-cpp/TCLIService_types.h 4536b41 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 742cfdc service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TEnColumn.java PRE-CREATION service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java feaed34 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java 805e69f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java 657f868 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java 48f4b45 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java 6e714c6 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java cc1a148 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java 1cd7980 service/src/gen/thrift/gen-py/TCLIService/ttypes.py efee8ef service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb bfb2b69 service/src/java/org/apache/hive/service/cli/Column.java 2e21f18 service/src/java/org/apache/hive/service/cli/ColumnBasedSet.java 47a582e service/src/java/org/apache/hive/service/cli/RowSetFactory.java e8f68ea 
service/src/java/org/apache/hive/service/cli/compression/ColumnCompressor.java PRE-CREATION service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java PRE-CREATION service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java PRE-CREATION service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 67bc778 service/src/test/org/apache/hive/service/cli/compression/SnappyIntCompressor.java PRE-CREATION service/src/test/org/apache/hive/service/cli/compression/TestEncodedColumnBasedSet.java PRE-CREATION service/src/test/resources/META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor PRE-CREATION Diff: https://reviews.apache.org/r/35792/diff/ Testing --- Testing has been done using a docker container-based query submitter that has an integer decompressor as part of it. Using the integer compressor (also provided) and the decompressor, the end-to-end functionality can be observed. File Attachments Patch file https://reviews.apache.org/media/uploaded/files/2015/06/23/16aa08f8-2393-460a-83ef-72464fc537db__HIVE-10438.patch Thanks, Rohit Dholakia
[jira] [Created] (HIVE-11588) merge master into branch
Sergey Shelukhin created HIVE-11588: --- Summary: merge master into branch Key: HIVE-11588 URL: https://issues.apache.org/jira/browse/HIVE-11588 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11590) AvroDeserializer is very chatty
Swarnim Kulkarni created HIVE-11590: --- Summary: AvroDeserializer is very chatty Key: HIVE-11590 URL: https://issues.apache.org/jira/browse/HIVE-11590 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Swarnim Kulkarni It seems like AvroDeserializer is currently very chatty, logging tons of messages at INFO level in the mapreduce logs. It would be helpful to push some of these down to debug level to keep the logs clean. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
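The proposed demotion can be sketched as follows; the class and method names are hypothetical, and java.util.logging stands in for Hive's actual commons-logging/log4j setup purely to keep the example self-contained:

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch of the requested cleanup: per-record messages move from INFO to a
// debug level (FINE in JUL terms), guarded so the message string is not even
// built unless debug logging is enabled.
public class QuietDeserializerDemo {

    // Logs one per-record message through the debug-level guard and reports
    // how many records a capturing handler actually received at the given
    // logger level ("INFO" vs "FINE").
    static int emittedAt(String levelName) {
        Level loggerLevel = Level.parse(levelName);
        Logger log = Logger.getLogger("avro.demo." + levelName);
        log.setUseParentHandlers(false);
        log.setLevel(loggerLevel);
        final int[] seen = {0};
        Handler counter = new Handler() {
            @Override public void publish(LogRecord r) { seen[0]++; }
            @Override public void flush() {}
            @Override public void close() {}
        };
        counter.setLevel(Level.ALL);
        log.addHandler(counter);
        if (log.isLoggable(Level.FINE)) {          // guard: skipped at INFO
            log.fine("Deserializing record ...");  // the formerly-INFO chatter
        }
        return seen[0];
    }
}
```

With the guard in place, a production logger configured at INFO emits nothing for these messages, while a debug-configured logger still sees them.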
[jira] [Created] (HIVE-11592) ORC metadata section can sometimes exceed protobuf message size limit
Prasanth Jayachandran created HIVE-11592: Summary: ORC metadata section can sometimes exceed protobuf message size limit Key: HIVE-11592 URL: https://issues.apache.org/jira/browse/HIVE-11592 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran If there are too many small stripes with many columns, the overhead for storing metadata (column stats) can exceed the default protobuf message size of 64MB. Reading such files throws the following exception:
{code}
Exception in thread "main" com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
	at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
	at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
	at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
	at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.<init>(OrcProto.java:1331)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.<init>(OrcProto.java:1281)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1374)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1369)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4887)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4803)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4990)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4985)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.<init>(OrcProto.java:12925)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.<init>(OrcProto.java:12872)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12961)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12956)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.<init>(OrcProto.java:13599)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.<init>(OrcProto.java:13546)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13635)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13630)
	at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.parseFrom(OrcProto.java:13746)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl$MetaInfoObjExtractor.<init>(ReaderImpl.java:468)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:314)
	at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
	at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:67)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
The only solution is to programmatically increase the CodedInputStream size limit. We should make this configurable via Hive config so that such ORC files remain readable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
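A sketch of the proposed configurability, under the assumption that the fix wires a new config value into protobuf's CodedInputStream.setSizeLimit(). Since protobuf itself is not available here, a stdlib-only reader models the pattern; the class and any config key name are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Models the pattern behind the proposed fix: enforce a configurable message
// size cap, defaulting to protobuf's 64 MB, and let configuration raise it.
public class LimitedMetadataReader {
    static final int DEFAULT_LIMIT = 64 << 20; // 64 MB, protobuf's default

    final int sizeLimit;

    LimitedMetadataReader(int configuredLimit) {
        // A non-positive configured value falls back to the default.
        this.sizeLimit = configuredLimit > 0 ? configuredLimit : DEFAULT_LIMIT;
    }

    // Reads a length-prefixed blob, refusing anything over the cap, the way
    // CodedInputStream refuses messages over its size limit.
    byte[] readMessage(InputStream in, int declaredSize) throws IOException {
        if (declaredSize > sizeLimit) {
            throw new IOException("Protocol message too large: " + declaredSize
                + " > " + sizeLimit + "; raise the configured size limit");
        }
        byte[] buf = new byte[declaredSize];
        int off = 0, n;
        while (off < declaredSize
                && (n = in.read(buf, off, declaredSize - off)) != -1) {
            off += n;
        }
        return buf;
    }

    // Convenience wrapper so callers see a simple success/failure result.
    static boolean canRead(int declaredSize, int configuredLimit) {
        try {
            new LimitedMetadataReader(configuredLimit)
                .readMessage(new ByteArrayInputStream(new byte[declaredSize]), declaredSize);
            return true;
        } catch (IOException tooLarge) {
            return false;
        }
    }
}
```

The real change would simply replace the cap check with a setSizeLimit() call on the CodedInputStream used to parse the ORC metadata section.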
[jira] [Created] (HIVE-11589) Invalid value such as '-1' should be checked for 'hive.txn.timeout'.
Takahiko Saito created HIVE-11589: - Summary: Invalid value such as '-1' should be checked for 'hive.txn.timeout'. Key: HIVE-11589 URL: https://issues.apache.org/jira/browse/HIVE-11589 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.2.1 Reporter: Takahiko Saito Assignee: Eugene Koifman Priority: Minor When a user accidentally sets an invalid value such as '-1' for 'hive.txn.timeout', the query simply fails, throwing 'NoSuchLockException':
{noformat}
2015-08-16 23:25:43,149 ERROR [HiveServer2-Background-Pool: Thread-206]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(159)) - NoSuchLockException(message:No such lock: 40)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1710)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:501)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:5571)
	at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
	at com.sun.proxy.$Proxy7.unlock(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1876)
	at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
	at com.sun.proxy.$Proxy8.unlock(Unknown Source)
	at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:134)
	at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.releaseLocks(DbLockManager.java:153)
	at org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1038)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1208)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
	at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
	at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
A better way to handle such an invalid value is to validate it before it can cause a NoSuchLockException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
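The suggested up-front check could look like this; the method names and messages are illustrative, not Hive's actual code:

```java
// Sketch: validate hive.txn.timeout when the configuration is read, instead
// of letting lock heartbeats fail later with NoSuchLockException.
public class TxnTimeoutCheck {

    static long validateTxnTimeout(String configuredValue) {
        final long timeout;
        try {
            timeout = Long.parseLong(configuredValue.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "hive.txn.timeout must be a number, got: " + configuredValue);
        }
        if (timeout <= 0) {
            throw new IllegalArgumentException(
                "hive.txn.timeout must be positive, got: " + timeout);
        }
        return timeout;
    }

    // True when the value passes validation, false when it would be rejected.
    static boolean isValid(String configuredValue) {
        try {
            validateTxnTimeout(configuredValue);
            return true;
        } catch (IllegalArgumentException rejected) {
            return false;
        }
    }
}
```

Rejecting '-1' at configuration time gives the user an actionable error instead of the opaque NoSuchLockException stack above.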
Review Request 37563: HIVE-10697 Fix for ObjectInspectorConvertors#UnionConvertor doing a faulty conversion
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37563/ --- Review request for hive and Hari Sankar Sivarama Subramaniyan. Repository: hive-git Description --- HIVE-10697 Fix for ObjectInspectorConvertors#UnionConvertor doing a faulty conversion Diffs - serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java 8ef8ce1736d50f0f9163cd5e3fd00ddd4bd810da serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java a64aee074d05a14c3c72079ff960039811936419 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java d1b11e82730e57f6894145478aae7c0c0c26e518 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 11852833577149152cefaef50ee49328733c9dde Diff: https://reviews.apache.org/r/37563/diff/ Testing --- Unit tests added. Thanks, Swarnim Kulkarni
Review Request 37529: HIVE-11383
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37529/ --- Review request for hive. Bugs: HIVE-11383 https://issues.apache.org/jira/browse/HIVE-11383 Repository: hive-git Description --- HIVE-11383 Diffs - pom.xml 15c280561fa1b1382c90de549a6f669088b0a524 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 5a5954dd984b0a0a00080d25c6c5dfe512f56141 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveSort.java 18d283824a02594a74d4c10192a583d496b25dd4 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinProjectTransposeRule.java fd8f5cb6d56d99c8a7d5215f39fe06fd6069c241 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java 29deed9ffe866ba343444a62c133da7fc2b13bf1 ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java f26d1dfd59598f9a0e29514124ef0646fa40be50 ql/src/test/results/clientpositive/auto_join13.q.out 952dbf885665f8b23ce563c6811de0e2234d6b9e ql/src/test/results/clientpositive/filter_cond_pushdown.q.out e09057a7f2904e14b7d301997aac871350501313 ql/src/test/results/clientpositive/join13.q.out 3b921b99d9fc603c983d59b8987a36700499dec9 ql/src/test/results/clientpositive/lineage3.q.out b6b4e0bc46d1819cebf356e8074a35103369b61c ql/src/test/results/clientpositive/subquery_notin.q.out fd6d53b69f76ec8840606abc45a01f2653dc282e ql/src/test/results/clientpositive/subquery_views.q.out 41834a3741e5843c056c1dc3e66cb867452832fc ql/src/test/results/clientpositive/tez/explainuser_1.q.out e8a978623d5333efff3834573a966bca47b11a98 ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 8f43b261f532952d2a48bef445bab48fc3c69e93 ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_2.q.out e814103aefbfe9cbe04d516787c7477e592a551e Diff: https://reviews.apache.org/r/37529/diff/ Testing --- Thanks, Jesús Camacho Rodríguez
Re: hive.ppd.remove.duplicatefilters description is incorrect. What is the correct one?
Ping. (Should we open a JIRA issue for this?) -- Lefty On Mon, Jun 1, 2015 at 6:41 PM, Lefty Leverenz leftylever...@gmail.com wrote: Good catch, Alexander! hive.ppd.remove.duplicatefilters was added in 0.8.0 by HIVE-1538 https://issues.apache.org/jira/browse/HIVE-1538 (FilterOperator is applied twice with ppd on) without any description. It isn't documented in the wiki yet. -- Lefty On Mon, Jun 1, 2015 at 12:36 PM, Alexander Pivovarov apivova...@gmail.com wrote: I noticed that conf/hive-default.xml.template has the following description:
<property>
  <name>hive.ppd.remove.duplicatefilters</name>
  <value>true</value>
  <description>Whether to push predicates down into storage handlers. Ignored when hive.optimize.ppd is false.</description>
</property>
Most probably the description was taken from hive.optimize.ppd.storage. So, what is the correct description for hive.ppd.remove.duplicatefilters?
[jira] [Created] (HIVE-11580) ThriftUnionObjectInspector#toString throws NPE
Jimmy Xiang created HIVE-11580: -- Summary: ThriftUnionObjectInspector#toString throws NPE Key: HIVE-11580 URL: https://issues.apache.org/jira/browse/HIVE-11580 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor ThriftUnionObjectInspector uses toString from StructObjectInspector, which accesses uninitialized member variable fields. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: too many 1.*.* unreleased versions on the JIRA
Thanks, Vikram and Sergey. -- Lefty On Mon, Aug 17, 2015 at 12:22 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Updated the releases. 1.0.2 and 1.2.2 are not released yet. On Fri, Aug 14, 2015 at 2:38 PM, Sergey Shelukhin ser...@hortonworks.com wrote: Anyone? :) On 15/8/13, 14:52, Sergey Shelukhin ser...@hortonworks.com wrote: On the JIRA, we currently have 1.1.0 marked as unreleased even though 1.2.0 is released (and 1.1.1 is also present); then, we have both 1.0.1 and 1.0.2, plus 1.2.1 and 1.2.2 showing in unreleased. I poked around and cannot see where this can be changed. Release managers for respective releases should probably clean this up, anyway :) -- Nothing better than when appreciated for hard work. -Mark
Re: Review Request 37329: HIVE-11513 Updating AvroLazyObjectInspector to handle bad data better
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37329/#review95604 --- Ship it! Ship It! - Xuefu Zhang On Aug. 11, 2015, 8:39 p.m., Swarnim Kulkarni wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37329/ --- (Updated Aug. 11, 2015, 8:39 p.m.) Review request for hive. Repository: hive-git Description --- HIVE-11513 Addressing review comments Diffs - serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java 9fc9873ec56a34d5026000c59376342b49b467e8 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroLazyObjectInspector.java PRE-CREATION Diff: https://reviews.apache.org/r/37329/diff/ Testing --- Unit tests added. Thanks, Swarnim Kulkarni
[jira] [Created] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.
Vaibhav Gumashta created HIVE-11581: --- Summary: HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string. Key: HIVE-11581 URL: https://issues.apache.org/jira/browse/HIVE-11581 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Currently, the client needs to specify several parameters based on which an appropriate connection is created with the server. In case of dynamic service discovery, when multiple HS2 instances are running, it is much more usable for the server to add its config parameters to ZK which the driver can use to configure the connection, instead of the jdbc/odbc user adding those in connection string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: too many 1.*.* unreleased versions on the JIRA
Updated the releases. 1.0.2 and 1.2.2 are not released yet. On Fri, Aug 14, 2015 at 2:38 PM, Sergey Shelukhin ser...@hortonworks.com wrote: Anyone? :) On 15/8/13, 14:52, Sergey Shelukhin ser...@hortonworks.com wrote: On the JIRA, we currently have 1.1.0 marked as unreleased even though 1.2.0 is released (and 1.1.1 is also present); then, we have both 1.0.1 and 1.0.2, plus 1.2.1 and 1.2.2 showing in unreleased. I poked around and cannot see where this can be changed. Release managers for respective releases should probably clean this up, anyway :) -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-11583) When PTF is used over large partitions the result could be corrupted
Illya Yalovyy created HIVE-11583: Summary: When PTF is used over large partitions the result could be corrupted Key: HIVE-11583 URL: https://issues.apache.org/jira/browse/HIVE-11583 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 1.2.1, 1.2.0, 1.0.0, 0.13.1, 0.14.0, 0.14.1 Environment: Hadoop 2.6 + Apache Hive built from trunk Reporter: Illya Yalovyy Priority: Critical Dataset: The window has 50001 records (2 blocks on disk and 1 block in memory); the size of the second block is 32Mb (2 splits). Result: When the last block is read from disk, only the first split is actually loaded; the second split gets missed. The total count of the result dataset is correct, but some records are missing and others are duplicated. Example:
{code:sql}
CREATE TABLE ptf_big_src (
  id INT,
  key STRING,
  grp STRING,
  value STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO TABLE ptf_big_src;

SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
---
-- A  25000
-- B  2
-- C  5001
---

CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key ORDER BY grp) grp_num FROM ptf_big_src;

SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
---
-- A  34296
-- B  15704
-- C  1
---
{code}
Counts by 'grp' are incorrect! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11582) Remove conf variable hive.mapred.supports.subdirectories
Ashutosh Chauhan created HIVE-11582: --- Summary: Remove conf variable hive.mapred.supports.subdirectories Key: HIVE-11582 URL: https://issues.apache.org/jira/browse/HIVE-11582 Project: Hive Issue Type: Task Components: Configuration Reporter: Ashutosh Chauhan This configuration is redundant since MAPREDUCE-1501 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11579) Invoking the set command will close standard error output [beeline-cli]
Ferdinand Xu created HIVE-11579: --- Summary: Invoking the set command will close standard error output [beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu We can easily reproduce the bug with the following steps:
{code}
hive> set system:xx=yy;
hive> lss;
hive>
{code}
The error output disappears because the err output stream is closed when the Hive statement is closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
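One common remedy for this class of bug is a close-shielding wrapper around the shared stream; this is a sketch of the pattern, not necessarily the patch Hive applied, and the helper method exists purely for demonstration:

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hand the statement a wrapper whose close() leaves the underlying stream
// open, so closing the statement cannot close the process-wide error stream.
public class CloseShieldOutputStream extends FilterOutputStream {

    public CloseShieldOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void close() throws IOException {
        // Flush anything buffered, but deliberately do NOT close `out`.
        flush();
    }

    // Demonstration helper: reports whether closing the shield propagated a
    // close() to the wrapped stream (it should not).
    static boolean closePropagates() {
        final boolean[] closed = {false};
        OutputStream probe = new OutputStream() {
            @Override public void write(int b) {}
            @Override public void close() { closed[0] = true; }
        };
        try {
            new CloseShieldOutputStream(probe).close();
        } catch (IOException impossible) {
            throw new AssertionError(impossible);
        }
        return closed[0];
    }
}
```

Wrapping System.err this way before handing it to statement plumbing keeps later error output intact even if a statement is closed eagerly.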
Re: Review Request 37150: HIVE-11375 Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37150/ --- (Updated Aug. 17, 2015, 8:01 p.m.) Review request for hive. Repository: hive-git Description --- HIVE-11375 Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 55ad0ce ql/src/test/queries/clientpositive/folder_predicate.q PRE-CREATION ql/src/test/results/clientpositive/folder_predicate.q.out PRE-CREATION Diff: https://reviews.apache.org/r/37150/diff/ Testing --- Thanks, Aihua Xu
[jira] [Created] (HIVE-11584) Update committer list
Dmitry Tolpeko created HIVE-11584: - Summary: Update committer list Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Priority: Minor Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11587) Fix memory estimates for mapjoin hashtable
Sergey Shelukhin created HIVE-11587: --- Summary: Fix memory estimates for mapjoin hashtable Key: HIVE-11587 URL: https://issues.apache.org/jira/browse/HIVE-11587 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Wei Zheng Due to legacy in-memory mapjoin code and conservative planning, the memory estimation code for the mapjoin hashtable is currently not very good. It allocates the probe side erring toward more memory, and does not take the data into account because, unlike the probe, the data is free to resize, so for performance it is better to allocate big and hope for the best on data size. That is not true for the hybrid case. There is code to cap the initial allocation based on available memory (the memSize argument), but due to some code rot the memory estimates from planning are not even passed to the hashtable anymore (there used to be two config settings: hashjoin size fraction by itself, or hashjoin size fraction for the group-by case), so it never caps the memory below 1 GB anymore. Initial capacity is estimated from the input key count, and in hybrid join the cache can exceed Java memory due to the number of segments. All of this code needs a review and fix. Suggested improvements:
1) Make sure the initialCapacity argument from the hybrid case is correct given the number of segments. See how it's calculated from keys for the regular case; it needs to be adjusted accordingly for the hybrid case if not done already.
2) Rename memSize to maxProbeSize, or similar, and make sure it's passed correctly based on estimates that take into account both probe and data size, especially in the hybrid case.
3) Cap single write buffer size to 8-16Mb.
4) For hybrid, don't pre-allocate WBs - only pre-allocate on write.
5) Change rounding up to a power of two to rounding down everywhere, at least for the hybrid case (?)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
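Suggestion 5 in the JIRA above can be illustrated with a small sketch (helper names hypothetical): rounding a requested capacity up to the next power of two can nearly double the allocation, while rounding down keeps it under the estimate:

```java
// Capacity rounding for a power-of-two-sized hashtable.
public class Pow2Capacity {

    // Current style: smallest power of two >= n (n must be >= 1).
    static int roundUp(int n) {
        int p = Integer.highestOneBit(n);
        return (p == n) ? n : p << 1;
    }

    // Proposed for the hybrid case: largest power of two <= n, so the
    // allocation never exceeds the planner's estimate.
    static int roundDown(int n) {
        return Integer.highestOneBit(n);
    }
}
```

For an estimate of 100 slots, roundUp allocates 128 while roundDown allocates 64; multiplied across many hybrid-join segments, the difference is what suggestion 5 targets.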
[jira] [Created] (HIVE-11585) Explicitly set pmf.setDetachAllOnCommit on metastore unless configured otherwise
Sushanth Sowmyan created HIVE-11585: --- Summary: Explicitly set pmf.setDetachAllOnCommit on metastore unless configured otherwise Key: HIVE-11585 URL: https://issues.apache.org/jira/browse/HIVE-11585 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan datanucleus.detachAllOnCommit has a default value of false. However, we've observed a number of objects (especially FieldSchema objects) being retained, which causes OOM issues on the metastore. Hive should default datanucleus.detachAllOnCommit to true unless explicitly overridden by users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
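The proposed default-unless-overridden behavior might be sketched like this; the property key is DataNucleus's real one, while the surrounding class and methods are illustrative:

```java
import java.util.Properties;

// Flip the effective default of datanucleus.detachAllOnCommit to "true",
// while still letting an explicit user setting win.
public class DataNucleusDefaults {

    static Properties withDetachDefault(Properties userConf) {
        Properties effective = new Properties();
        effective.putAll(userConf);
        // Only inject the default when the user has not set the key.
        if (effective.getProperty("datanucleus.detachAllOnCommit") == null) {
            effective.setProperty("datanucleus.detachAllOnCommit", "true");
        }
        return effective;
    }

    // Demo accessors below avoid extra setup at the call site.
    static String effectiveDetachSetting() {
        return effectiveDetachSetting(new Properties());
    }

    static String effectiveDetachSetting(Properties userConf) {
        return withDetachDefault(userConf).getProperty("datanucleus.detachAllOnCommit");
    }

    static Properties confWith(String key, String value) {
        Properties p = new Properties();
        p.setProperty(key, value);
        return p;
    }
}
```

With no user setting the metastore would detach objects on commit (freeing the retained FieldSchema instances), while an operator who deliberately sets the key to false keeps the old behavior.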
[jira] [Created] (HIVE-11586) ObjectInspectorFactory.getReflectionObjectInspector is not thread-safe
Jimmy Xiang created HIVE-11586: -- Summary: ObjectInspectorFactory.getReflectionObjectInspector is not thread-safe Key: HIVE-11586 URL: https://issues.apache.org/jira/browse/HIVE-11586 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang ObjectInspectorFactory#getReflectionObjectInspectorNoCache adds the newly created object inspector to the cache before calling its init() method, to allow reusing the cache when dealing with recursive types. A second thread can therefore call getReflectionObjectInspector and fetch an uninitialized instance of ReflectionStructObjectInspector. Another issue is that if two threads call ObjectInspectorFactory.getReflectionObjectInspector at the same time, one thread could get an object inspector that is not in the cache: they could both call getReflectionObjectInspectorNoCache(), but only one will put its new object inspector into the cache successfully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
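A standard pattern for the race described above is to finish initialization before publishing to the cache, letting the loser of a putIfAbsent race discard its copy. This is only a sketch of the pattern (not the patch that eventually landed), and it gives up the publish-before-init trick Hive uses for recursive types, which would need separate handling:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Publish fully-initialized inspectors only; concurrent callers either see
// nothing (and build their own) or a completely constructed instance.
public class InspectorCacheDemo {

    static class Inspector {
        private volatile boolean ready = false;
        void init() { ready = true; }        // stand-in for the real init()
        boolean initialized() { return ready; }
    }

    private final ConcurrentMap<String, Inspector> cache = new ConcurrentHashMap<>();

    Inspector get(String typeName) {
        Inspector existing = cache.get(typeName);
        if (existing != null) {
            return existing;                 // fast path: fully built instance
        }
        Inspector fresh = new Inspector();
        fresh.init();                        // finish construction first ...
        existing = cache.putIfAbsent(typeName, fresh); // ... then publish
        // If another thread won the race, use its (initialized) instance.
        return (existing != null) ? existing : fresh;
    }
}
```

Both races disappear at the cost of occasionally building an inspector twice, which is safe because only one copy ever becomes visible through the cache.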
Hive-0.14 - Build # 1046 - Still Failing
Changes for Build #1025
Changes for Build #1026
Changes for Build #1027
Changes for Build #1028
Changes for Build #1029
Changes for Build #1030
Changes for Build #1031
Changes for Build #1032
Changes for Build #1033
Changes for Build #1034
Changes for Build #1035
Changes for Build #1036
Changes for Build #1037
Changes for Build #1038
Changes for Build #1039
Changes for Build #1040
Changes for Build #1041
Changes for Build #1042
Changes for Build #1043
Changes for Build #1044
Changes for Build #1045
Changes for Build #1046

No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #1046)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-0.14/1046/ to view the results.
Re: Review Request 35792: HIVE-10438 - Architecture for ResultSet Compression via external plugin
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35792/ --- (Updated Aug. 17, 2015, 8:32 p.m.) Review request for hive, Vaibhav Gumashta, Xiaojian Wang, Xiao Meng, and Xuefu Zhang. Changes --- 1. Fixed whitespace issues. 2. A bug in one of the tests. 3. Some generated files that are necessary weren't committed before. They are now. Repository: hive-git Description --- This patch enables ResultSet compression for Hive using external plugins. The patch proposes a plugin architecture that enables using external plugins to compress ResultSets on-the-fly. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 730f5be hcatalog/core/.gitignore 0a7a9c5 jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java bb2b695 metastore/bin/.gitignore 0e4bba6 service/if/TCLIService.thrift baf583f service/src/gen/thrift/gen-cpp/TCLIService.h 29a9f4a service/src/gen/thrift/gen-cpp/TCLIService_types.h 4536b41 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 742cfdc service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TEnColumn.java PRE-CREATION service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java feaed34 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java 805e69f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java 657f868 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java 48f4b45 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java 6e714c6 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java cc1a148 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java 1cd7980 service/src/gen/thrift/gen-py/TCLIService/ttypes.py efee8ef service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb bfb2b69 service/src/java/org/apache/hive/service/cli/Column.java 
2e21f18 service/src/java/org/apache/hive/service/cli/ColumnBasedSet.java 47a582e service/src/java/org/apache/hive/service/cli/RowSetFactory.java e8f68ea service/src/java/org/apache/hive/service/cli/compression/ColumnCompressor.java PRE-CREATION service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java PRE-CREATION service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java PRE-CREATION service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 67bc778 service/src/test/org/apache/hive/service/cli/compression/SnappyIntCompressor.java PRE-CREATION service/src/test/org/apache/hive/service/cli/compression/TestEncodedColumnBasedSet.java PRE-CREATION service/src/test/resources/META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor PRE-CREATION Diff: https://reviews.apache.org/r/35792/diff/ Testing --- Testing has been done using a docker container-based query submitter that has an integer decompressor as part of it. Using the integer compressor (also provided) and the decompressor, the end-to-end functionality can be observed. File Attachments Patch file https://reviews.apache.org/media/uploaded/files/2015/06/23/16aa08f8-2393-460a-83ef-72464fc537db__HIVE-10438.patch Thanks, Rohit Dholakia