[jira] [Created] (HIVE-11591) change thrift generation to use undated annotations

2015-08-17 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11591:
---

 Summary: change thrift generation to use undated annotations
 Key: HIVE-11591
 URL: https://issues.apache.org/jira/browse/HIVE-11591
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Thrift has added class annotations to generated classes; these contain the 
generation date. Because of this, all the generated Java thrift files change on 
every re-gen, even if you only make a small change that should not affect a 
bazillion files. 
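
For illustration, a minimal sketch of the annotation shape Thrift 0.9.2 stamps on 
generated classes (the class name here is hypothetical, not one of Hive's 
generated types):
{code}
import javax.annotation.Generated;

// Illustrative only: the date attribute changes on every re-generation, so
// every generated .java file appears modified even after a trivial change.
@Generated(value = "Autogenerated by Thrift Compiler (0.9.2)",
           date = "2015-08-17")
public class TExampleStruct {
}
{code}
An undated annotation would keep the value attribute but drop the date, making 
re-generated files byte-identical when nothing substantive changed.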

This depends on upgrading to Thrift 0.9.3, which doesn't exist yet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35792: HIVE-10438 - Architecture for ResultSet Compression via external plugin

2015-08-17 Thread Rohit Dholakia

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35792/
---

(Updated Aug. 17, 2015, 10:37 p.m.)


Review request for hive, Vaibhav Gumashta, Xiaojian Wang, Xiao Meng, and Xuefu 
Zhang.


Changes
---

Reverted two .gitignore files.


Repository: hive-git


Description
---

This patch enables ResultSet compression for Hive via external plugins: it 
proposes a plugin architecture through which external plugins compress 
ResultSets on the fly.
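
As a rough sketch of the mechanism implied by the META-INF/services entry in the 
file list below (the interface methods are hypothetical, not copied from the 
patch), plugins could be discovered via Java's standard ServiceLoader:
{code}
import java.util.ServiceLoader;

// Sketch only: a plugin jar would list its implementation class in
// META-INF/services/...ColumnCompressor; the server then looks it up by name.
public final class CompressorLookup {
  public interface ColumnCompressor {
    String getVendor();                 // hypothetical: identifies the plugin
    byte[] compress(byte[] columnData); // hypothetical: compressed payload
  }

  public static ColumnCompressor find(String vendor) {
    for (ColumnCompressor c : ServiceLoader.load(ColumnCompressor.class)) {
      if (vendor.equals(c.getVendor())) {
        return c;
      }
    }
    return null; // no matching plugin on the classpath
  }
}
{code}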


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 730f5be 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java bb2b695 
  service/if/TCLIService.thrift baf583f 
  service/src/gen/thrift/gen-cpp/TCLIService.h 29a9f4a 
  service/src/gen/thrift/gen-cpp/TCLIService_types.h 4536b41 
  service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 742cfdc 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TEnColumn.java
 PRE-CREATION 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java
 feaed34 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
 805e69f 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java
 657f868 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java
 48f4b45 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java
 6e714c6 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java
 cc1a148 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java
 1cd7980 
  service/src/gen/thrift/gen-py/TCLIService/ttypes.py efee8ef 
  service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb bfb2b69 
  service/src/java/org/apache/hive/service/cli/Column.java 2e21f18 
  service/src/java/org/apache/hive/service/cli/ColumnBasedSet.java 47a582e 
  service/src/java/org/apache/hive/service/cli/RowSetFactory.java e8f68ea 
  
service/src/java/org/apache/hive/service/cli/compression/ColumnCompressor.java 
PRE-CREATION 
  
service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java
 PRE-CREATION 
  
service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java
 PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
67bc778 
  
service/src/test/org/apache/hive/service/cli/compression/SnappyIntCompressor.java
 PRE-CREATION 
  
service/src/test/org/apache/hive/service/cli/compression/TestEncodedColumnBasedSet.java
 PRE-CREATION 
  
service/src/test/resources/META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor
 PRE-CREATION 

Diff: https://reviews.apache.org/r/35792/diff/


Testing
---

Testing has been done using a Docker container-based query submitter that 
includes an integer decompressor. Using the integer compressor (also provided) 
and the decompressor, the end-to-end functionality can be observed.


File Attachments


Patch file
  
https://reviews.apache.org/media/uploaded/files/2015/06/23/16aa08f8-2393-460a-83ef-72464fc537db__HIVE-10438.patch


Thanks,

Rohit Dholakia



[jira] [Created] (HIVE-11588) merge master into branch

2015-08-17 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11588:
---

 Summary: merge master into branch
 Key: HIVE-11588
 URL: https://issues.apache.org/jira/browse/HIVE-11588
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: hbase-metastore-branch


NO PRECOMMIT TESTS




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11590) AvroDeserializer is very chatty

2015-08-17 Thread Swarnim Kulkarni (JIRA)
Swarnim Kulkarni created HIVE-11590:
---

 Summary: AvroDeserializer is very chatty
 Key: HIVE-11590
 URL: https://issues.apache.org/jira/browse/HIVE-11590
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Swarnim Kulkarni


It seems like AvroDeserializer is currently very chatty, logging tons of 
messages at INFO level in the MapReduce logs. It would be helpful to push some 
of these down to DEBUG level to keep the logs clean.
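
A minimal sketch of the suggested change (the logger and message are 
illustrative, not lifted from AvroDeserializer):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative only: demote per-record messages from INFO to DEBUG, and
// guard them so the message string is not built unless DEBUG is enabled.
public class ChattyExample {
  private static final Log LOG = LogFactory.getLog(ChattyExample.class);

  void deserializeRecord(Object record) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Deserializing record: " + record);
    }
  }
}
{code}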



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11592) ORC metadata section can sometimes exceed protobuf message size limit

2015-08-17 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-11592:


 Summary: ORC metadata section can sometimes exceed protobuf 
message size limit
 Key: HIVE-11592
 URL: https://issues.apache.org/jira/browse/HIVE-11592
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


If there are too many small stripes and many columns, the overhead for 
storing metadata (column stats) can exceed the default protobuf message size 
limit of 64MB. Reading such files will throw the following exception:
{code}
Exception in thread "main" com.google.protobuf.InvalidProtocolBufferException: 
Protocol message was too large.  May be malicious.  Use 
CodedInputStream.setSizeLimit() to increase the size limit.
at 
com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
at 
com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
at 
com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
at 
com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.init(OrcProto.java:1331)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.init(OrcProto.java:1281)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1374)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1369)
at 
com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.init(OrcProto.java:4887)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.init(OrcProto.java:4803)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4990)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4985)
at 
com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.init(OrcProto.java:12925)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.init(OrcProto.java:12872)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12961)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12956)
at 
com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.init(OrcProto.java:13599)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.init(OrcProto.java:13546)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13635)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13630)
at 
com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.parseFrom(OrcProto.java:13746)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl$MetaInfoObjExtractor.init(ReaderImpl.java:468)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.init(ReaderImpl.java:314)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

The only solution for this is to programmatically increase the CodedInputStream 
size limit. We should make this limit configurable via a Hive config setting so 
that such ORC files remain readable.
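
A minimal sketch of the workaround (the 1GB figure is an example; in the 
proposed fix it would come from a Hive config setting):
{code}
import java.io.IOException;
import java.io.InputStream;

import com.google.protobuf.CodedInputStream;
import org.apache.hadoop.hive.ql.io.orc.OrcProto;

// Sketch only: raise protobuf's 64MB default before parsing the ORC
// metadata section.
public final class OrcMetadataReader {
  static OrcProto.Metadata readMetadata(InputStream in) throws IOException {
    CodedInputStream coded = CodedInputStream.newInstance(in);
    coded.setSizeLimit(1024 << 20); // 1GB instead of the 64MB default
    return OrcProto.Metadata.parseFrom(coded);
  }
}
{code}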



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11589) Invalid value such as '-1' should be checked for 'hive.txn.timeout'.

2015-08-17 Thread Takahiko Saito (JIRA)
Takahiko Saito created HIVE-11589:
-

 Summary: Invalid value such as '-1' should be checked for 
'hive.txn.timeout'.
 Key: HIVE-11589
 URL: https://issues.apache.org/jira/browse/HIVE-11589
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.2.1
Reporter: Takahiko Saito
Assignee: Eugene Koifman
Priority: Minor


When a user accidentally sets an invalid value such as '-1' for 
'hive.txn.timeout', the query simply fails, throwing 'NoSuchLockException':
{noformat}
2015-08-16 23:25:43,149 ERROR [HiveServer2-Background-Pool: Thread-206]: 
metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(159)) - 
NoSuchLockException(message:No such lock: 40)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1710)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:501)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:5571)
at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy7.unlock(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1876)
at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
at com.sun.proxy.$Proxy8.unlock(Unknown Source)
at 
org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:134)
at 
org.apache.hadoop.hive.ql.lockmgr.DbLockManager.releaseLocks(DbLockManager.java:153)
at 
org.apache.hadoop.hive.ql.Driver.releaseLocksAndCommitOrRollback(Driver.java:1038)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1208)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at 
org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}

A better way to handle such an invalid value is to validate it up front, before 
it can surface as a NoSuchLockException.
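
A minimal sketch of such an up-front check (assuming HiveConf's time-valued 
getter; where exactly the check belongs is for the patch to decide):
{code}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.hive.conf.HiveConf;

// Sketch only: reject a non-positive hive.txn.timeout at configuration time
// instead of letting locks expire and surface later as NoSuchLockException.
public final class TxnTimeoutCheck {
  static long validatedTxnTimeout(HiveConf conf) {
    long timeout = conf.getTimeVar(
        HiveConf.ConfVars.HIVE_TXN_TIMEOUT, TimeUnit.MILLISECONDS);
    if (timeout <= 0) {
      throw new IllegalArgumentException(
          "hive.txn.timeout must be positive, got: " + timeout);
    }
    return timeout;
  }
}
{code}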



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 37563: HIVE-10697 Fix for ObjectInspectorConvertors#UnionConvertor doing a faulty conversion

2015-08-17 Thread Swarnim Kulkarni

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37563/
---

Review request for hive and Hari Sankar Sivarama Subramaniyan.


Repository: hive-git


Description
---

HIVE-10697 Fix for ObjectInspectorConvertors#UnionConvertor doing a faulty 
conversion


Diffs
-

  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java
 8ef8ce1736d50f0f9163cd5e3fd00ddd4bd810da 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SettableUnionObjectInspector.java
 a64aee074d05a14c3c72079ff960039811936419 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardUnionObjectInspector.java
 d1b11e82730e57f6894145478aae7c0c0c26e518 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java
 11852833577149152cefaef50ee49328733c9dde 

Diff: https://reviews.apache.org/r/37563/diff/


Testing
---

Unit tests added.


Thanks,

Swarnim Kulkarni



Review Request 37529: HIVE-11383

2015-08-17 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37529/
---

Review request for hive.


Bugs: HIVE-11383
https://issues.apache.org/jira/browse/HIVE-11383


Repository: hive-git


Description
---

HIVE-11383


Diffs
-

  pom.xml 15c280561fa1b1382c90de549a6f669088b0a524 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java 
5a5954dd984b0a0a00080d25c6c5dfe512f56141 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveSort.java
 18d283824a02594a74d4c10192a583d496b25dd4 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinProjectTransposeRule.java
 fd8f5cb6d56d99c8a7d5215f39fe06fd6069c241 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java
 29deed9ffe866ba343444a62c133da7fc2b13bf1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 
f26d1dfd59598f9a0e29514124ef0646fa40be50 
  ql/src/test/results/clientpositive/auto_join13.q.out 
952dbf885665f8b23ce563c6811de0e2234d6b9e 
  ql/src/test/results/clientpositive/filter_cond_pushdown.q.out 
e09057a7f2904e14b7d301997aac871350501313 
  ql/src/test/results/clientpositive/join13.q.out 
3b921b99d9fc603c983d59b8987a36700499dec9 
  ql/src/test/results/clientpositive/lineage3.q.out 
b6b4e0bc46d1819cebf356e8074a35103369b61c 
  ql/src/test/results/clientpositive/subquery_notin.q.out 
fd6d53b69f76ec8840606abc45a01f2653dc282e 
  ql/src/test/results/clientpositive/subquery_views.q.out 
41834a3741e5843c056c1dc3e66cb867452832fc 
  ql/src/test/results/clientpositive/tez/explainuser_1.q.out 
e8a978623d5333efff3834573a966bca47b11a98 
  ql/src/test/results/clientpositive/tez/tez_dynpart_hashjoin_2.q.out 
8f43b261f532952d2a48bef445bab48fc3c69e93 
  ql/src/test/results/clientpositive/tez/tez_vector_dynpart_hashjoin_2.q.out 
e814103aefbfe9cbe04d516787c7477e592a551e 

Diff: https://reviews.apache.org/r/37529/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



Re: hive.ppd.remove.duplicatefilters description is incorrect. What is the correct one?

2015-08-17 Thread Lefty Leverenz
Ping.  (Should we open a JIRA issue for this?)

-- Lefty


On Mon, Jun 1, 2015 at 6:41 PM, Lefty Leverenz leftylever...@gmail.com
wrote:

 Good catch, Alexander!

 hive.ppd.remove.duplicatefilters was added in 0.8.0 by HIVE-1538
 https://issues.apache.org/jira/browse/HIVE-1538 (FilterOperator is
 applied twice with ppd on) without any description.  It isn't documented in
 the wiki yet.

 -- Lefty

 On Mon, Jun 1, 2015 at 12:36 PM, Alexander Pivovarov apivova...@gmail.com
  wrote:

 I noticed that conf/hive-default.xml.template has the following
 description

  <property>
<name>hive.ppd.remove.duplicatefilters</name>
<value>true</value>
<description>Whether to push predicates down into storage handlers.
Ignored when hive.optimize.ppd is false.</description>
  </property>

 Most probably the description was taken from hive.optimize.ppd.storage.

 So, what is the correct description for
 hive.ppd.remove.duplicatefilters?





[jira] [Created] (HIVE-11580) ThriftUnionObjectInspector#toString throws NPE

2015-08-17 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-11580:
--

 Summary: ThriftUnionObjectInspector#toString throws NPE
 Key: HIVE-11580
 URL: https://issues.apache.org/jira/browse/HIVE-11580
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor


ThriftUnionObjectInspector inherits toString from StructObjectInspector, which 
accesses the uninitialized member variable 'fields'.
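
A minimal sketch of the failure shape and an obvious guard (class and field 
names are illustrative, not the actual Hive source):
{code}
import java.util.List;

// Sketch only: the inherited toString() dereferences a member that the
// union inspector never initializes, so it throws NPE.
class StructLikeInspector {
  protected List<String> fields; // left null by the union subclass

  @Override
  public String toString() {
    return "fields=" + fields.size(); // NPE when fields is null
  }
}

class UnionLikeInspector extends StructLikeInspector {
  @Override
  public String toString() {
    return fields == null ? getClass().getSimpleName() : super.toString();
  }
}
{code}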



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: too many 1.*.* unreleased versions on the JIRA

2015-08-17 Thread Lefty Leverenz
Thanks, Vikram and Sergey.

-- Lefty

On Mon, Aug 17, 2015 at 12:22 PM, Vikram Dixit K vikram.di...@gmail.com
wrote:

 Updated the releases. 1.0.2 and 1.2.2 are not released yet.

 On Fri, Aug 14, 2015 at 2:38 PM, Sergey Shelukhin ser...@hortonworks.com
 wrote:

  Anyone? :)
 
  On 15/8/13, 14:52, Sergey Shelukhin ser...@hortonworks.com wrote:
 
  On the JIRA, we currently have 1.1.0 marked as unreleased even though
  1.2.0 is released (and 1.1.1 is also present); then, we have both 1.0.1
  and 1.0.2, plus 1.2.1 and 1.2.2 showing in unreleased.
  I poked around and cannot see where this can be changed. Release
 managers
  for respective releases should probably clean this up, anyway :)
  
  
 
 


 --
 Nothing better than when appreciated for hard work.
 -Mark



Re: Review Request 37329: HIVE-11513 Updating AvroLazyObjectInspector to handle bad data better

2015-08-17 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37329/#review95604
---

Ship it!


Ship It!

- Xuefu Zhang


On Aug. 11, 2015, 8:39 p.m., Swarnim Kulkarni wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/37329/
 ---
 
 (Updated Aug. 11, 2015, 8:39 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-11513 Addressing review comments
 
 
 Diffs
 -
 
   
 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java
  9fc9873ec56a34d5026000c59376342b49b467e8 
   
 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroLazyObjectInspector.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/37329/diff/
 
 
 Testing
 ---
 
 Unit tests added.
 
 
 Thanks,
 
 Swarnim Kulkarni
 




[jira] [Created] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2015-08-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-11581:
---

 Summary: HiveServer2 should store connection params in ZK when 
using dynamic service discovery for simpler client connection string.
 Key: HIVE-11581
 URL: https://issues.apache.org/jira/browse/HIVE-11581
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


Currently, the client needs to specify several parameters based on which an 
appropriate connection is created with the server. In the case of dynamic 
service discovery, when multiple HS2 instances are running, it is much more 
usable for the server to add its config parameters to ZooKeeper, which the 
driver can then use to configure the connection, instead of the JDBC/ODBC user 
adding those to the connection string.
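
For context, a hedged example of the discovery-mode JDBC URL this would 
simplify (the ZooKeeper hostnames are placeholders):
{code}
import java.sql.Connection;
import java.sql.DriverManager;

// Example only: the client points at the ZooKeeper ensemble and an HS2
// instance is resolved from ZK; HIVE-11581 would let the remaining
// per-server parameters come from ZK as well.
public final class ZkJdbcExample {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/"
        + ";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";
    try (Connection conn = DriverManager.getConnection(url, "user", "")) {
      System.out.println("connected: " + !conn.isClosed());
    }
  }
}
{code}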



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: too many 1.*.* unreleased versions on the JIRA

2015-08-17 Thread Vikram Dixit K
Updated the releases. 1.0.2 and 1.2.2 are not released yet.

On Fri, Aug 14, 2015 at 2:38 PM, Sergey Shelukhin ser...@hortonworks.com
wrote:

 Anyone? :)

 On 15/8/13, 14:52, Sergey Shelukhin ser...@hortonworks.com wrote:

 On the JIRA, we currently have 1.1.0 marked as unreleased even though
 1.2.0 is released (and 1.1.1 is also present); then, we have both 1.0.1
 and 1.0.2, plus 1.2.1 and 1.2.2 showing in unreleased.
 I poked around and cannot see where this can be changed. Release managers
 for respective releases should probably clean this up, anyway :)
 
 




-- 
Nothing better than when appreciated for hard work.
-Mark


[jira] [Created] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-08-17 Thread Illya Yalovyy (JIRA)
Illya Yalovyy created HIVE-11583:


 Summary: When PTF is used over a large partitions result could be 
corrupted
 Key: HIVE-11583
 URL: https://issues.apache.org/jira/browse/HIVE-11583
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 1.2.1, 1.2.0, 1.0.0, 0.13.1, 0.14.0, 0.14.1
 Environment: Hadoop 2.6 + Apache hive built from trunk

Reporter: Illya Yalovyy
Priority: Critical


Dataset: 
 The window has 50001 records (2 blocks on disk and 1 block in memory).
 The size of the second block is >32MB (2 splits).

Result:
When the last block is read from disk, only the first split is actually loaded; 
the second split gets missed. The total count of the result dataset is correct, 
but some records are missing and others are duplicated.

Example:
{code:sql}
CREATE TABLE ptf_big_src (
  id INT,
  key STRING,
  grp STRING,
  value STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
TABLE ptf_big_src;

SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
---
-- A  25000
-- B  20000
-- C  5001
---

CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key ORDER 
BY grp) grp_num FROM ptf_big_src;

SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
---
-- A  34296
-- B  15704
-- C  1
---
{code}
Counts by 'grp' are incorrect!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11582) Remove conf variable hive.mapred.supports.subdirectories

2015-08-17 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-11582:
---

 Summary: Remove conf variable hive.mapred.supports.subdirectories
 Key: HIVE-11582
 URL: https://issues.apache.org/jira/browse/HIVE-11582
 Project: Hive
  Issue Type: Task
  Components: Configuration
Reporter: Ashutosh Chauhan


This configuration has been redundant since MAPREDUCE-1501.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]

2015-08-17 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-11579:
---

 Summary: Invoke the set command will close standard error 
output[beeline-cli]
 Key: HIVE-11579
 URL: https://issues.apache.org/jira/browse/HIVE-11579
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu


We can easily reproduce the bug with the following steps:
{code}
hive> set system:xx=yy;
hive> lss;
hive>
{code}
The error output disappears because the error output stream is closed when 
the Hive statement is closed.
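
One plausible shape for a fix, as a sketch only (not the committed change): 
shield the process-wide error stream from a close() issued during statement 
teardown.
{code}
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

// Sketch only: a wrapper whose close() flushes but never closes the real
// stderr, so statement cleanup cannot take the error output down with it.
final class NonClosingStream extends FilterOutputStream {
  NonClosingStream(OutputStream out) {
    super(out);
  }

  @Override
  public void close() throws IOException {
    flush(); // keep buffered bytes, but leave the underlying stream open
  }
}

// usage: hand new PrintStream(new NonClosingStream(System.err), true) to the
// statement wherever it currently receives System.err directly.
{code}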



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 37150: HIVE-11375 Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)

2015-08-17 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37150/
---

(Updated Aug. 17, 2015, 8:01 p.m.)


Review request for hive.


Repository: hive-git


Description
---

HIVE-11375 Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java
 55ad0ce 
  ql/src/test/queries/clientpositive/folder_predicate.q PRE-CREATION 
  ql/src/test/results/clientpositive/folder_predicate.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/37150/diff/


Testing
---


Thanks,

Aihua Xu



[jira] [Created] (HIVE-11584) Update committer list

2015-08-17 Thread Dmitry Tolpeko (JIRA)
Dmitry Tolpeko created HIVE-11584:
-

 Summary: Update committer list
 Key: HIVE-11584
 URL: https://issues.apache.org/jira/browse/HIVE-11584
 Project: Hive
  Issue Type: Bug
Reporter: Dmitry Tolpeko
Priority: Minor


Please update committer list:

Name: Dmitry Tolpeko
Apache ID: dmtolpeko
Organization: EPAM (www.epam.com)

Thank you,

Dmitry



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11587) Fix memory estimates for mapjoin hashtable

2015-08-17 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-11587:
---

 Summary: Fix memory estimates for mapjoin hashtable
 Key: HIVE-11587
 URL: https://issues.apache.org/jira/browse/HIVE-11587
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Wei Zheng


Due to legacy code in the in-memory mapjoin and conservative planning, the 
memory estimation code for the mapjoin hashtable is currently not very good. It 
allocates the probe erring on the side of more memory, not taking the data into 
account: unlike the probe, the data side is free to resize, so for perf it's 
better to allocate big and hope for the best on data size. This is not true for 
the hybrid case.
There's code to cap the initial allocation based on available memory (the 
memSize argument), but due to some code rot, the memory estimates from planning 
are not even passed to the hashtable anymore (there used to be two config 
settings, hashjoin size fraction by itself, or hashjoin size fraction for the 
group-by case), so it never caps the memory below 1GB anymore. 
The initial capacity is estimated from the input key count, and in the hybrid 
join case the cache can exceed Java memory due to the number of segments.

There needs to be a review and fix of all this code.
Suggested improvements:
1) Make sure the initialCapacity argument from the hybrid case is correct given 
the number of segments. See how it's calculated from keys for the regular case; 
it needs to be adjusted accordingly for the hybrid case if not done already.
2) Rename memSize to maxProbeSize (or similar), and make sure it's passed 
correctly based on estimates that take into account both probe and data size, 
esp. in the hybrid case.
3) Cap the single write buffer size to 8-16MB.
4) For hybrid, don't pre-allocate WBs - only allocate on write.
5) Change rounding up to a power of two to rounding down everywhere it's used, 
at least for the hybrid case (?) - see the sketch below.
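
As a sketch of what suggestion 5 amounts to (hypothetical helper names; the 
real sizing code is more involved):
{code}
// Sketch only: round-up vs. round-down to a power of two when sizing
// hashtable segments. Valid for n >= 2.
public final class Pow2 {
  static int roundUpToPowerOfTwo(int n) {     // current behavior
    return Integer.highestOneBit(n - 1) << 1; // e.g. 1000 -> 1024
  }

  static int roundDownToPowerOfTwo(int n) {   // suggestion 5
    return Integer.highestOneBit(n);          // e.g. 1000 -> 512
  }
}
{code}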




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11585) Explicitly set pmf.setDetachAllOnCommit on metastore unless configured otherwise

2015-08-17 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-11585:
---

 Summary: Explicitly set pmf.setDetachAllOnCommit on metastore 
unless configured otherwise
 Key: HIVE-11585
 URL: https://issues.apache.org/jira/browse/HIVE-11585
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


datanucleus.detachAllOnCommit has a default value of false. However, we've 
observed a number of objects (especially FieldSchema objects) being retained, 
which causes OOM issues on the metastore. Hive should default 
datanucleus.detachAllOnCommit to true, unless explicitly overridden by users.
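
A minimal sketch of the proposed behavior against the standard JDO API (the 
fallback logic is the point; the real change would read Hive's DataNucleus 
config):
{code}
import javax.jdo.PersistenceManagerFactory;

// Sketch only: default detachAllOnCommit to true unless the user has
// explicitly configured a value.
public final class DetachDefault {
  static void applyDefault(PersistenceManagerFactory pmf, Boolean userSetting) {
    pmf.setDetachAllOnCommit(userSetting != null ? userSetting : true);
  }
}
{code}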



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-11586) ObjectInspectorFactory.getReflectionObjectInspector is not thread-safe

2015-08-17 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-11586:
--

 Summary: ObjectInspectorFactory.getReflectionObjectInspector is 
not thread-safe
 Key: HIVE-11586
 URL: https://issues.apache.org/jira/browse/HIVE-11586
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


ObjectInspectorFactory#getReflectionObjectInspectorNoCache adds the newly 
created object inspector to the cache before calling its init() method, to 
allow reusing the cache when dealing with recursive types. A second thread can 
therefore call getReflectionObjectInspector and fetch an uninitialized instance 
of ReflectionStructObjectInspector.

Another issue arises if two threads call 
ObjectInspectorFactory.getReflectionObjectInspector at the same time: one 
thread could get an object inspector that is not in the cache, i.e. they could 
both call getReflectionObjectInspectorNoCache(), but only one will put its new 
object inspector into the cache successfully.
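
For reference, a minimal sketch of one conventional shape for the fix: 
initialize before publishing, and let racers adopt the winner's instance. Note 
the real code publishes early on purpose (to break recursion on recursive 
types), so the actual fix has to be more careful than this.
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: a race-free cache that never exposes an uninitialized value.
public final class InspectorCache<K, V extends InspectorCache.Initializable> {
  public interface Initializable {
    void init();
  }

  private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<K, V>();

  public V getOrCreate(K key, V fresh) {
    V existing = cache.get(key);
    if (existing != null) {
      return existing;            // already initialized and published
    }
    fresh.init();                 // initialize BEFORE publishing
    existing = cache.putIfAbsent(key, fresh);
    return existing != null ? existing : fresh;
  }
}
{code}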



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hive-0.14 - Build # 1046 - Still Failing

2015-08-17 Thread Apache Jenkins Server
Changes for Build #1025

Changes for Build #1026

Changes for Build #1027

Changes for Build #1028

Changes for Build #1029

Changes for Build #1030

Changes for Build #1031

Changes for Build #1032

Changes for Build #1033

Changes for Build #1034

Changes for Build #1035

Changes for Build #1036

Changes for Build #1037

Changes for Build #1038

Changes for Build #1039

Changes for Build #1040

Changes for Build #1041

Changes for Build #1042

Changes for Build #1043

Changes for Build #1044

Changes for Build #1045

Changes for Build #1046



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #1046)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-0.14/1046/ to view 
the results.

Re: Review Request 35792: HIVE-10438 - Architecture for ResultSet Compression via external plugin

2015-08-17 Thread Rohit Dholakia

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35792/
---

(Updated Aug. 17, 2015, 8:32 p.m.)


Review request for hive, Vaibhav Gumashta, Xiaojian Wang, Xiao Meng, and Xuefu 
Zhang.


Changes
---

1. Fixed whitespace issues. 
2. Fixed a bug in one of the tests. 
3. Some necessary generated files weren't committed before; they are now.


Repository: hive-git


Description
---

This patch enables ResultSet compression for Hive via external plugins: it 
proposes a plugin architecture through which external plugins compress 
ResultSets on the fly.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 730f5be 
  hcatalog/core/.gitignore 0a7a9c5 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java bb2b695 
  metastore/bin/.gitignore 0e4bba6 
  service/if/TCLIService.thrift baf583f 
  service/src/gen/thrift/gen-cpp/TCLIService.h 29a9f4a 
  service/src/gen/thrift/gen-cpp/TCLIService_types.h 4536b41 
  service/src/gen/thrift/gen-cpp/TCLIService_types.cpp 742cfdc 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TEnColumn.java
 PRE-CREATION 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java
 feaed34 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java
 805e69f 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java
 657f868 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java
 48f4b45 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TProtocolVersion.java
 6e714c6 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java
 cc1a148 
  
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java
 1cd7980 
  service/src/gen/thrift/gen-py/TCLIService/ttypes.py efee8ef 
  service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb bfb2b69 
  service/src/java/org/apache/hive/service/cli/Column.java 2e21f18 
  service/src/java/org/apache/hive/service/cli/ColumnBasedSet.java 47a582e 
  service/src/java/org/apache/hive/service/cli/RowSetFactory.java e8f68ea 
  
service/src/java/org/apache/hive/service/cli/compression/ColumnCompressor.java 
PRE-CREATION 
  
service/src/java/org/apache/hive/service/cli/compression/ColumnCompressorService.java
 PRE-CREATION 
  
service/src/java/org/apache/hive/service/cli/compression/EncodedColumnBasedSet.java
 PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
67bc778 
  
service/src/test/org/apache/hive/service/cli/compression/SnappyIntCompressor.java
 PRE-CREATION 
  
service/src/test/org/apache/hive/service/cli/compression/TestEncodedColumnBasedSet.java
 PRE-CREATION 
  
service/src/test/resources/META-INF/services/org.apache.hive.service.cli.compression.ColumnCompressor
 PRE-CREATION 

Diff: https://reviews.apache.org/r/35792/diff/


Testing
---

Testing has been done using a docker container-based query submitter that has 
an integer decompressor as part of it. Using the integer compressor (also 
provided) and the decompressor, the end-to-end functionality can be observed.


File Attachments


Patch file
  
https://reviews.apache.org/media/uploaded/files/2015/06/23/16aa08f8-2393-460a-83ef-72464fc537db__HIVE-10438.patch


Thanks,

Rohit Dholakia