Review Request 54022: HIVE-15270 ExprNode/Sarg changes to support values supplied during query runtime

2016-11-22 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54022/
---

Review request for hive, Ashutosh Chauhan and Prasanth_J.


Bugs: HIVE-15270
https://issues.apache.org/jira/browse/HIVE-15270


Repository: hive-git


Description
---

- Some concept of available runtime values that can be retrieved for a 
MapWork/ReduceWork
- ExprNode/Sarg changes to pass a Conf during initialization - this allows the 
expression to retrieve the MapWork at query time (using 
Utilities.getMapWork(Configuration)) to access runtime-supplied values.
- Ability to populate the runtime values in Tez mode via incoming Tez edges
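
The lookup pattern described above could be sketched roughly as follows. This is an illustrative simulation only: the Registry class and its method names are hypothetical stand-ins, not Hive's actual DynamicValueRegistry API; the point is that the value is resolved lazily, at evaluation time, from values a Tez edge would have populated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of a runtime-value registry (names are illustrative,
// not Hive's actual API): expressions look values up lazily at query time.
public class RuntimeValueSketch {
    static class Registry {
        private final Map<String, Supplier<Object>> values = new HashMap<>();

        void register(String key, Supplier<Object> supplier) {
            values.put(key, supplier);
        }

        Object getValue(String key) {
            Supplier<Object> s = values.get(key);
            if (s == null) {
                throw new IllegalStateException("No runtime value for " + key);
            }
            return s.get();  // resolved only when the expression is evaluated
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // In the real design, an incoming Tez edge would supply this value,
        // e.g. the minimum join key seen by an upstream vertex.
        registry.register("store_id_min", () -> 42L);
        System.out.println(registry.getValue("store_id_min"));
    }
}
```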


Diffs
-

  orc/src/test/org/apache/orc/impl/TestRecordReaderImpl.java cdd62ac 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java 
69ba4a2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 5512ee2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DynamicValueRegistry.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java 
24c8281 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantDefaultEvaluator.java
 89a75eb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java 
4fe72a0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeDynamicValueEvaluator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java b8d6ab7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java 
0d03d8f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorHead.java 42685fb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java 0a6b66a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java 
ff32626 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java 
221abd9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java bd0d28c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java f28d33e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java ac5331e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 6cbcab6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 9049ddd 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DynamicValueRegistryTez.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 
6f36dfb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 
cf3c8ab 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
0cb6c8a 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 
80b0a14 
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 
9013084 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java
 9e9beb0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 13a0811 
  ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicValue.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDynamicValueDesc.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
93b50a6 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 
8cbc26d 
  storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/LiteralDelegate.java 
PRE-CREATION 
  
storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java
 8fda95c 
  
storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java 
10d8c51 

Diff: https://reviews.apache.org/r/54022/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Created] (HIVE-15270) ExprNode/Sarg changes to support values supplied during query runtime

2016-11-22 Thread Jason Dere (JIRA)
Jason Dere created HIVE-15270:
-

 Summary: ExprNode/Sarg changes to support values supplied during 
query runtime
 Key: HIVE-15270
 URL: https://issues.apache.org/jira/browse/HIVE-15270
 Project: Hive
  Issue Type: Improvement
Reporter: Jason Dere
Assignee: Jason Dere


Infrastructure changes to support retrieval of query-runtime-supplied values, 
needed for dynamic min/max (HIVE-15269) and bloom filter join optimizations.
- Some concept of available runtime values that can be retrieved for a 
MapWork/ReduceWork
- ExprNode/Sarg changes to pass a Conf during initialization - this allows the 
expression to retrieve the MapWork at query time (using 
Utilities.getMapWork(Configuration)) to access runtime-supplied values.
- Ability to populate the runtime values in Tez mode via incoming Tez edges



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez

2016-11-22 Thread Jason Dere (JIRA)
Jason Dere created HIVE-15269:
-

 Summary: Dynamic Min-Max runtime-filtering for Tez
 Key: HIVE-15269
 URL: https://issues.apache.org/jira/browse/HIVE-15269
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Deepak Jaiswal


If a dimension table and fact table are joined:
{noformat}
select *
from store join store_sales on (store.id = store_sales.store_id)
where store.s_store_name = 'My Store'
{noformat}

One optimization that can be done is to get the min/max store id values that 
come out of the scan/filter of the store table, and send this min/max range 
(via a Tez edge) to the task which is scanning the store_sales table.
We can add a BETWEEN(min, max) predicate to the store_sales TableScan, where 
this predicate can be pushed down to the storage handler (for example for ORC 
formats). Pushing a min/max predicate to the ORC reader would allow us to 
avoid having to read entire row groups during the table scan.
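
The skipping decision this enables can be sketched as follows. This is a hypothetical helper for illustration, not ORC's actual reader code: a row group can be skipped when its column statistics cannot overlap the runtime BETWEEN(min, max) range.

```java
// Illustrative sketch (not ORC's actual API): a row group whose min/max
// column statistics fall entirely outside the runtime predicate range
// cannot contain matching rows, so it can be skipped without reading it.
public class RowGroupPruneSketch {
    static boolean canSkip(long groupMin, long groupMax, long predMin, long predMax) {
        return groupMax < predMin || groupMin > predMax;
    }

    public static void main(String[] args) {
        // Suppose the dimension-side scan produced store ids in [100, 200].
        System.out.println(canSkip(0, 50, 100, 200));    // group entirely below range
        System.out.println(canSkip(150, 300, 100, 200)); // group overlaps range
    }
}
```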





Re: [hive/ql] discussions of HIVE-15221, the right memory checkpoint

2016-11-22 Thread Hui Fei
Adding the link for HIVE-15221: https://issues.apache.org/jira/browse/HIVE-15221
Thanks.

2016-11-23 10:41 GMT+08:00 Hui Fei :

> hi all
>
> In the checkMemoryStatus function, I think the memory comparison is only
> meaningful after garbage collection.
>
> If so, we get an additional benefit: the task may be able to keep running
> after GC rather than fail.
>
> In addition, I think the current memory checkpoint is not reliable, because
> we don't know when GC happens. After GC the check may pass, but before GC
> it may fail. The checkpoint does not show how much memory is really in
> use, excluding garbage.
>
> Any suggestions?
>


[jira] [Created] (HIVE-15268) limit+offset is broken (differently for ACID or not)

2016-11-22 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-15268:
---

 Summary: limit+offset is broken (differently for ACID or not)
 Key: HIVE-15268
 URL: https://issues.apache.org/jira/browse/HIVE-15268
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


I think some part of the map-side limit implementation implicitly assumes 
CombineHiveInputFormat; when splits are not combined, results are incorrect. In 
fact they are also incorrect for ORC, although differently, even though it 
seems like ORC should combine splits. I didn't fully investigate.
IIRC results are correct with text.

{noformat}
set hive.fetch.task.conversion=none;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE limitoffset_text (key STRING, value STRING) PARTITIONED BY (ds 
STRING, hr STRING);
CREATE TABLE limitoffset (key STRING, value STRING) PARTITIONED BY (ds STRING, 
hr STRING) STORED AS orc;
create table acid_dynamic(key STRING, value STRING) PARTITIONED BY (ds STRING, 
hr STRING) 
clustered by (key) into 2 buckets stored as orc TBLPROPERTIES 
('transactional'='true');

insert INTO TABLE limitoffset PARTITION (ds, hr) select * from srcpart;
insert INTO TABLE limitoffset_text PARTITION (ds, hr) select * from srcpart;
insert INTO TABLE acid_dynamic PARTITION (ds, hr) select * from srcpart;

select count(key) from limitoffset_text;
select count(key) from limitoffset;
select count(key) from acid_dynamic;

SELECT limitoffset_text.key FROM limitoffset_text LIMIT 490,200;
SELECT acid_dynamic.key FROM acid_dynamic LIMIT 490,200;
SELECT limitoffset.key FROM limitoffset LIMIT 490,200;
{noformat}





[hive/ql] discussions of HIVE-15221, the right memory checkpoint

2016-11-22 Thread Hui Fei
hi all

In the checkMemoryStatus function, I think the memory comparison is only
meaningful after garbage collection.

If so, we get an additional benefit: the task may be able to keep running
after GC rather than fail.

In addition, I think the current memory checkpoint is not reliable, because
we don't know when GC happens. After GC the check may pass, but before GC
it may fail. The checkpoint does not show how much memory is really in use,
excluding garbage.

Any suggestions?
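
The idea raised in this thread could be sketched roughly like this. This is an illustrative sketch only, not Hive's actual checkMemoryStatus code, and System.gc() is merely a hint the JVM may ignore; the point is to measure used heap after collection so the check reflects live data rather than collectible garbage.

```java
// Illustrative sketch (not Hive's actual code): measure used heap only after
// hinting a GC, so a memory-pressure check counts live data, not garbage.
public class MemoryCheckSketch {
    static long usedHeapAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc();  // a hint only; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long used = usedHeapAfterGc();
        long max = Runtime.getRuntime().maxMemory();
        // Under this scheme, the task would only be failed if live data
        // still exceeds the configured threshold after collection.
        System.out.println(used >= 0 && used <= max);
    }
}
```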


Re: Review Request 53915: Extend JSONMessageFactory to store additional information about metadata objects on different table events

2016-11-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53915/#review156688
---


Ship it!




Ship It!

- Thejas Nair


On Nov. 22, 2016, 11:02 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53915/
> ---
> 
> (Updated Nov. 22, 2016, 11:02 p.m.)
> 
> 
> Review request for hive, Sushanth Sowmyan and Thejas Nair.
> 
> 
> Bugs: HIVE-15180
> https://issues.apache.org/jira/browse/HIVE-15180
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-15180
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a9474c4 
>   
> hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
>  ea7520d 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/api/TestHCatClientNotification.java
>  a661962 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
>  4f97cf4 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AddPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateDatabaseMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateFunctionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropDatabaseMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropFunctionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/EventMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/EventUtils.java 
> PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageDeserializer.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAddPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateDatabaseMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateFunctionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropDatabaseMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropFunctionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropIndexMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropPartitionMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropTableMessage.java
>  PRE-CREATION 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
>  PRE-CREATION 
>   
> 

Re: Review Request 53915: Extend JSONMessageFactory to store additional information about metadata objects on different table events

2016-11-22 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53915/
---

(Updated Nov. 22, 2016, 11:02 p.m.)


Review request for hive, Sushanth Sowmyan and Thejas Nair.


Bugs: HIVE-15180
https://issues.apache.org/jira/browse/HIVE-15180


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-15180


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a9474c4 
  
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
 ea7520d 
  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/api/TestHCatClientNotification.java
 a661962 
  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java
 4f97cf4 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AddPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/AlterTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateDatabaseMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateFunctionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/CreateTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropDatabaseMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropFunctionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/DropTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/EventMessage.java 
PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/messaging/EventUtils.java 
PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageDeserializer.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/MessageFactory.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAddPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONAlterTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateDatabaseMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateFunctionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONCreateTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropDatabaseMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropFunctionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropIndexMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropPartitionMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONDropTableMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageDeserializer.java
 PRE-CREATION 
  
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/53915/diff/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Yongzhi Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156659
---



The latest patch solves all the issues Illya Yalovyy pointed out; the fix looks 
good.
+1

- Yongzhi Chen


On Nov. 22, 2016, 10:35 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 22, 2016, 10:35 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/
---

(Updated Nov. 22, 2016, 10:35 p.m.)


Review request for hive.


Changes
---

Addressed issues from Illya Yalovyy.

Also, I stuck with the if (!exists || !rename) condition on S3, rather than 
using listFiles(), to avoid OOM issues with concurrent HS2 requests. We can 
work on better performance in a different JIRA.


Bugs: HIVE-15199
https://issues.apache.org/jira/browse/HIVE-15199


Repository: hive-git


Description
---

The patch helps execute repeated INSERT INTO statements on S3 tables when the 
scratch directory is on S3.


Diffs (updated)
-

  itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
  itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 

Diff: https://reviews.apache.org/r/53966/diff/


Testing
---


Thanks,

Sergio Pena



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Sergio Pena


> On Nov. 22, 2016, 9:30 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2789
> > 
> >
> > Scalability concern:
> > 
> > On some real datasets, there could be millions of elements in that list.
> > If this happens in HS2 with many concurrent connections, the JVM can
> > easily go down with OOM exceptions. I would suggest reconsidering that
> > approach.

You're right, I did not see that case. It would probably be better to stick 
with the if (!exists && !rename) condition. This will be slow when doing many 
repeated INSERT INTO statements, but it will not have memory problems.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156629
---


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Sergio Pena


> On Nov. 22, 2016, 9:30 p.m., Illya Yalovyy wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2953
> > 
> >
> > is "copy" part of the file name misleading? It is not actually a copy 
> > of an original file.

Correct, but that's what the old code is using:

    for (int counter = 1; !destFs.rename(srcP, destPath); counter++) {
      destPath = new Path(destf, name + ("_copy_" + counter) + filetype);
    }

The filename could be better, but I'm keeping the same one to fix the issue, 
since I would need to investigate the code further to make sure that changing 
it does not affect anything else.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156629
---


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Yongzhi Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156644
---


Ship it!




Ship It!

- Yongzhi Chen


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Illya Yalovyy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156629
---




common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1007)


Can we use an explicit type as the return type? Something like:

    final class NameAndType {
      final String name;
      final String type;
    }



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (lines 1010 - 1019)


return new NameAndType(FilenameUtils.getBaseName(filename), 
FilenameUtils.getExtension(filename));


https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1015)


extra ";" is not needed.



common/src/java/org/apache/hadoop/hive/common/FileUtils.java (line 1019)


This utility method should be covered with unit tests. Please make sure you 
have covered cases like:

s3://mybucket.test/foo/bar/0_0



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2786)


Scalability concern:

On some real datasets, there could be millions of elements in that list. If 
this happens in HS2 with many concurrent connections, the JVM can easily go 
down with OOM exceptions. I would suggest reconsidering that approach.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2921)


is "copy" part of the file name misleading? It is not actually a copy of an 
original file.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2923)


Just a note:
filename + "." + filetype is 10x faster than String.format("%s%s", 
filename, filetype).

Also it seems like "." is missing.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2928)


"." is missing between name and type.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2943)


FilenameUtils can do the job:

https://commons.apache.org/proper/commons-io/javadocs/api-1.4/org/apache/commons/io/FilenameUtils.html


- Illya Yalovyy
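
As a quick sanity check of the concatenation note in the review above: both forms produce the same filename (once the "." is included in the format string), and plain concatenation simply avoids the format-string parsing overhead. The filename values below are made up for illustration.

```java
// Verifies that concatenation and String.format build the same filename;
// concatenation is cheaper because it skips format-string parsing.
public class FileNameConcat {
    public static void main(String[] args) {
        String name = "000000_0";  // illustrative bucket-file name
        String type = "txt";
        String a = name + "." + type;
        String b = String.format("%s.%s", name, type);
        System.out.println(a.equals(b));
        System.out.println(a);
    }
}
```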


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Sergio Pena


> On Nov. 22, 2016, 9:27 p.m., Yongzhi Chen wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2983
> > 
> >
> > If isBlobStoragePath && !destFs.exists(destFilePath) holds, then the
> > second condition, !destFs.rename(sourcePath, destFilePath), will be
> > evaluated. I assume you do not want that to be called, right?

I wanted it to be called. On HDFS, rename() will return 'false' if the 
destination file already exists, but on S3 rename() will return 'true' even if 
the file already exists, so that's why I use 'exists' for S3 only.
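
The condition being discussed can be sketched as a small simulation. This is illustrative only: there are no real FileSystem calls here, and needsCopySuffix is a hypothetical name. On HDFS a failed rename() alone signals the name clash, while on S3 an explicit exists() check is needed because rename() reports success even when the destination exists.

```java
// Simulates the rename-guard logic described in this thread (hypothetical
// helper, no real filesystem calls): decide when to fall back to a
// name_copy_N destination instead of the original filename.
public class RenameGuardSketch {
    static boolean needsCopySuffix(boolean isBlobStore, boolean destExists, boolean renameOk) {
        if (isBlobStore) {
            // On S3, exists() must be checked explicitly: rename() alone
            // cannot detect that the destination was already present.
            return destExists || !renameOk;
        }
        // On HDFS, rename() already returns false when the destination exists.
        return !renameOk;
    }

    public static void main(String[] args) {
        System.out.println(needsCopySuffix(true, true, true));   // S3 clash caught by exists()
        System.out.println(needsCopySuffix(false, true, false)); // HDFS clash caught by rename()
        System.out.println(needsCopySuffix(true, false, true));  // clean move on S3
    }
}
```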


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156639
---


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 53966: HIVE-15199: INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-22 Thread Yongzhi Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53966/#review156639
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java (line 2951)


If isBlobStoragePath && !destFs.exists(destFilePath) holds, then the second 
condition, !destFs.rename(sourcePath, destFilePath), will be evaluated. I 
assume you do not want that to be called, right?


- Yongzhi Chen


On Nov. 21, 2016, 11:54 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53966/
> ---
> 
> (Updated Nov. 21, 2016, 11:54 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-15199
> https://issues.apache.org/jira/browse/HIVE-15199
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch helps execute repeated INSERT INTO statements on S3 tables when the 
> scratch directory is on S3.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 
> 1d8c04160c35e48781b20f8e6e14760c19df9ca5 
>   itests/hive-blobstore/src/test/queries/clientpositive/insert_into.q 
> 919ff7d9c7cb40062d68b876d6acbc8efb8a8cf1 
>   itests/hive-blobstore/src/test/results/clientpositive/insert_into.q.out 
> c25d0c4eec6983b6869e2eba711b39ba91a4c6e0 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 61b8bd0ac40cffcd6dca0fc874940066bc0aeffe 
> 
> Diff: https://reviews.apache.org/r/53966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



[jira] [Created] (HIVE-15267) Make query length calculation logic more accurate in TxnUtils.needNewQuery()

2016-11-22 Thread Wei Zheng (JIRA)
Wei Zheng created HIVE-15267:


 Summary: Make query length calculation logic more accurate in 
TxnUtils.needNewQuery()
 Key: HIVE-15267
 URL: https://issues.apache.org/jira/browse/HIVE-15267
 Project: Hive
  Issue Type: Bug
  Components: Hive, Transactions
Affects Versions: 2.1.0, 1.2.1
Reporter: Wei Zheng
Assignee: Wei Zheng


HIVE-15181 received the following review comment, which this ticket will address:
{code}
in TxnUtils.needNewQuery() "sizeInBytes / 1024 > queryMemoryLimit" doesn't do 
the right thing.
If the user sets METASTORE_DIRECT_SQL_MAX_QUERY_LENGTH to 1K, they most likely 
want each SQL string to be at most 1K.
But if sizeInBytes=2047, this still returns false.
It should include length of "suffix" in computation of sizeInBytes
Along the same lines: the check for max query length is done after each batch 
is already added to the query. Suppose there are 1000 9-digit txn IDs in each 
IN(...). That's, conservatively, 18KB of text. So the length of each query is 
increasing in 18KB chunks. 
I think the check for query length should be done for each item in IN clause.
If some DB has a limit on query length of X, then any query > X will fail. So I 
think this must ensure not to produce any queries > X, even by 1 char.
For example, case 3.1 of the UT generates a query of almost 4000 characters - 
this is clearly > 1KB.
{code}
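The per-item check suggested above can be sketched as follows (illustrative names, not the actual TxnUtils code): the length limit, including the closing suffix, is enforced before each ID is appended, so no emitted query can exceed the limit.

```java
import java.util.ArrayList;
import java.util.List;

public class InClauseBatcher {
    // Splits "prefix id,id,... suffix" queries so that every emitted
    // query string, including its suffix, stays within maxLen characters.
    static List<String> buildQueries(String prefix, String suffix,
                                     List<Long> ids, int maxLen) {
        List<String> queries = new ArrayList<>();
        StringBuilder sb = new StringBuilder(prefix);
        boolean firstInQuery = true;
        for (long id : ids) {
            String item = (firstInQuery ? "" : ",") + id;
            // Check BEFORE appending, and include the suffix length, so the
            // finished query can never exceed maxLen (not even by 1 char).
            if (!firstInQuery
                    && sb.length() + item.length() + suffix.length() > maxLen) {
                queries.add(sb.append(suffix).toString());
                sb = new StringBuilder(prefix);
                item = String.valueOf(id); // no leading comma in a new query
            }
            sb.append(item);
            firstInQuery = false;
        }
        queries.add(sb.append(suffix).toString());
        return queries;
    }
}
```

For example, with prefix "DELETE FROM T WHERE ID IN(", suffix ")", four 9-digit IDs, and maxLen 40, this yields four queries of 36 characters each rather than one over-long query.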



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15266) Edit test output of negative blobstore tests to match HIVE-15226

2016-11-22 Thread Thomas Poepping (JIRA)
Thomas Poepping created HIVE-15266:
--

 Summary: Edit test output of negative blobstore tests to match 
HIVE-15226
 Key: HIVE-15266
 URL: https://issues.apache.org/jira/browse/HIVE-15266
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 2.2.0
Reporter: Thomas Poepping
Assignee: Thomas Poepping


In HIVE-15226 ( https://issues.apache.org/jira/browse/HIVE-15226 ), blobstore 
tests were changed to print a different masking pattern for the blobstore path. 
In that patch, test output was replaced for the clientpositive test ( 
insert_into.q ), but not for the clientnegative test ( select_dropped_table.q 
), causing the negative tests to fail.

This patch is the result of -Dtest.output.overwrite=true with the 
clientnegative tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15265) support snapshot isolation for MM tables

2016-11-22 Thread Wei Zheng (JIRA)
Wei Zheng created HIVE-15265:


 Summary: support snapshot isolation for MM tables
 Key: HIVE-15265
 URL: https://issues.apache.org/jira/browse/HIVE-15265
 Project: Hive
  Issue Type: Sub-task
Reporter: Wei Zheng


Since MM tables use the incremental "delta" insertion mechanism via ACID, 
it makes sense for MM tables to support snapshot isolation as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15264) Cannot connect to HBase regionserver in standalone mode after changing regionserver file's content from localhost to a non-localhost hostname

2016-11-22 Thread le anh duc (JIRA)
le anh duc created HIVE-15264:
-

 Summary: Cannot connect to HBase regionserver in standalone mode after 
changing regionserver file's content from localhost to a non-localhost hostname
 Key: HIVE-15264
 URL: https://issues.apache.org/jira/browse/HIVE-15264
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 2.1.0
Reporter: le anh duc


Run HBase in standalone mode, but with the regionserver file changed:
its content changed from "localhost" to a different hostname, such as 
"test.server.com".
Run hiveserver2.
Run beeline and connect to hiveserver2.
Try to create an HBase table (mapped to HBase).
You will receive an exception: hiveserver2 tried to connect to the HBase 
regionserver but could not receive a response from it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


HIVE-1555 discussion

2016-11-22 Thread Dmitry Zagorulkin
Hello!

I've implemented a simple solution, with some hard-coding for now.
It has been tested with an Oracle database.

{code:sql}
beeline> !connect jdbc:hive2://localhost:1
Connecting to jdbc:hive2://localhost:1
Enter username for jdbc:hive2://localhost:1:
Enter password for jdbc:hive2://localhost:1:
Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1>
0: jdbc:hive2://localhost:1> SET 
hive.metastore.warehouse.dir=${env:HOME}/Documents/hive-warehouse;
No rows affected (0.158 seconds)
0: jdbc:hive2://localhost:1>
0: jdbc:hive2://localhost:1> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id INT,
. . . . . . . . . . . . . . . .>  book_name STRING,
. . . . . . . . . . . . . . . .>  author_name STRING,
. . . . . . . . . . . . . . . .>  book_isbn STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY 
"org.apache.hive.storagehandler.JDBCStorageHandler"
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = 
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = 
"jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .>  "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .>  "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );
No rows affected (2.297 seconds)
0: jdbc:hive2://localhost:1>
0: jdbc:hive2://localhost:1>
0: jdbc:hive2://localhost:1> select * from books3;
+-+---+-+---+
| books3.book_id  | books3.book_name  | books3.author_name  | books3.book_isbn  
|
+-+---+-+---+
| 124123  | name  | author  | 132321adsaf31 
|
| 13  | name2 | author2 | asd213fadsf   
|
| 2345236 | name3 | author3 | asdfds1234123 
|
+-+---+-+---+
3 rows selected (2.146 seconds)
0: jdbc:hive2://localhost:1> explain select * from books3;
++
|  Explain   |
++
| STAGE DEPENDENCIES:|
|   Stage-0 is a root stage  |
||
| STAGE PLANS:   |
|   Stage: Stage-0   |
| Fetch Operator |
|   limit: -1|
|   Processor Tree:  |
| TableScan  |
|   alias: books3|
|   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column 
stats: NONE |
|   Select Operator  |
| expressions: book_id (type: string), book_name (type: string), 
author_name (type: string), book_isbn (type: string) |
| outputColumnNames: _col0, _col1, _col2, _col3 |
| Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column 
stats: NONE |
| ListSink   |
||
++
17 rows selected (0.508 seconds)
{code}

This solution works in two steps:
1. First, grab all meta info from the external table.
2. Configure DBInputFormat and DBOutputFormat with the table meta.

What do you think about asking the user to specify all the needed 
information about columns and types inside the serde properties section?

Something like this:

0: jdbc:hive2://localhost:1> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id INT,
. . . . . . . . . . . . . . . .>  book_name STRING,
. . . . . . . . . . . . . . . .>  author_name STRING,
. . . . . . . . . . . . . . . .>  book_isbn STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY 
"org.apache.hive.storagehandler.JDBCStorageHandler"
WITH SERDEPROPERTIES (
"hive.jdbc.columns.mapping" = 
"book_id:int(32), book_name:varchar(20), author_name:varchar(20), 
book_isbn:varchar(20)") 
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = 
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = 

[jira] [Created] (HIVE-15263) Detect the values for incorrect NULL values

2016-11-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15263:
---

 Summary: Detect the values for incorrect NULL values
 Key: HIVE-15263
 URL: https://issues.apache.org/jira/browse/HIVE-15263
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


We have seen incorrect NULL values for SD_ID in TBLS for Hive tables. 
That column is nullable because it is legitimately NULL for Hive views.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15262) Drill 1.8 UI doesn't display Hive join query results

2016-11-22 Thread Gopal Nagar (JIRA)
Gopal Nagar created HIVE-15262:
--

 Summary: Drill 1.8 UI doesn't display Hive join query results
 Key: HIVE-15262
 URL: https://issues.apache.org/jira/browse/HIVE-15262
 Project: Hive
  Issue Type: Bug
  Components: Hive, Web UI
Reporter: Gopal Nagar


Hi All,
I am using Apache Drill 1.8.0 on AWS EMR and joining two Hive tables; a 
sample query is below. It works fine in the Drill CLI but gives the error 
below after running for a few minutes. A simple select query (select t1.col 
from hive.table t1) works fine in both the Drill CLI and the UI; only the 
join query has the problem.
If I cancel the join query from the background, the UI then displays the 
results. This is a very strange situation.

Drill is installed on an AWS node with 32 GB RAM and 80 GB storage; I did 
not allocate memory to Drill separately. I am trying to join two tables with 
4607818 and 14273378 rows respectively. Please find the attached 
drillbit.log file as well.
My confusion is that the Drill CLI works fine with this data and shows the 
output, so why does the UI not display the output and instead throw an 
error, yet immediately display the result when I cancel the query from the 
background?
I believe I am missing some configuration (which holds the output in a 
buffer and displays the result after cancellation) and need your help.

Join Query 
--
select t1.col FROM hive.table1 as t1 join hive.table2 as t2 on t1.col = t2.col 
limit 1000;

Error
---
Query Failed: An Error Occurred 
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
RpcException: Data not accepted downstream. Fragment 1:4 [Error Id: 
0b5ed2db-3653-4e3a-9c92-d0a6cd69b66e on 
ip-172-31-16-222.us-west-2.compute.internal:31010]





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 53983: HIVE-14582 : Add trunc(numeric) udf

2016-11-22 Thread chinnarao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53983/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

Overload the trunc() function to accept numbers.

trunc() will now accept a date or a number as its argument and behave as 
follows:

trunc(date, fmt) / trunc(N, D) - Returns

If the input is a date, returns the date with the time portion of the day 
truncated to the unit specified by the format model fmt. 
If fmt is omitted, the date is truncated to the nearest day. Currently only 
'MONTH'/'MON'/'MM' and 'YEAR'/''/'YY' are supported as formats.

If the input is a number, returns N truncated to D decimal places. If D is 
omitted, N is truncated to 0 places.
D can be negative to truncate (zero out) D digits left of the decimal point.
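The numeric behavior described above can be sketched with BigDecimal (illustrative only; the actual implementation lives in GenericUDFTrunc):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class TruncSketch {
    // trunc(N, D): RoundingMode.DOWN truncates toward zero; a negative D
    // zeroes out D digits to the left of the decimal point.
    static BigDecimal trunc(BigDecimal n, int d) {
        return n.setScale(d, RoundingMode.DOWN);
    }

    public static void main(String[] args) {
        BigDecimal n = new BigDecimal("1234.567");
        System.out.println(trunc(n, 2).toPlainString());  // 1234.56
        System.out.println(trunc(n, 0).toPlainString());  // 1234
        System.out.println(trunc(n, -2).toPlainString()); // 1200
    }
}
```

Note that truncation is not rounding: trunc(1234.567, 2) drops the trailing 7 rather than rounding up.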


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java 
e20ad65 
  ql/src/test/queries/clientpositive/udf_trunc_number.q PRE-CREATION 
  ql/src/test/results/clientpositive/udf_trunc.q.out 4c9f76d 
  ql/src/test/results/clientpositive/udf_trunc_number.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/53983/diff/


Testing
---

All tests pass.


Thanks,

chinna



[jira] [Created] (HIVE-15261) Exception in thread "main" java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.0.0-alpha1

2016-11-22 Thread R (JIRA)
R created HIVE-15261:


 Summary: Exception in thread "main" 
java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 
3.0.0-alpha1
 Key: HIVE-15261
 URL: https://issues.apache.org/jira/browse/HIVE-15261
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.0.1
Reporter: R






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Permissions to edit pages

2016-11-22 Thread Lefty Leverenz
Anishek, you need a Confluence username as described here: How to get
permission to edit.

-- Lefty


On Mon, Nov 21, 2016 at 8:16 PM, Anishek Agarwal 
wrote:

> Hello,
>
> Can I please have the permission to edit the pages for Hive @
> https://cwiki.apache.org/confluence/display/Hive/Home
>
> Regards,
> Anishek
>


[jira] [Created] (HIVE-15260) Auto delete old log files of hive services

2016-11-22 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-15260:


 Summary: Auto delete old log files of hive services
 Key: HIVE-15260
 URL: https://issues.apache.org/jira/browse/HIVE-15260
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Reporter: Thejas M Nair


Hive log4j settings rotate the old log files by date, but they don't delete 
the old log files.
It would be good to delete the old log files so that the space used doesn't 
keep increasing forever.
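One possible approach, assuming the Log4j 2 properties format that Hive 2.x uses (the appender name and paths below are illustrative and would need to match the actual entries in hive-log4j2.properties), is a Delete action on the rollover strategy:

```properties
# Hypothetical fragment for hive-log4j2.properties: on each rollover,
# delete rotated hive.log files older than 30 days (needs Log4j 2.5+).
appender.DRFA.strategy.type = DefaultRolloverStrategy
appender.DRFA.strategy.action.type = Delete
appender.DRFA.strategy.action.basepath = ${sys:hive.log.dir}
appender.DRFA.strategy.action.maxdepth = 1
appender.DRFA.strategy.action.condition.type = IfFileName
appender.DRFA.strategy.action.condition.glob = hive.log.*
appender.DRFA.strategy.action.condition.nested.type = IfLastModified
appender.DRFA.strategy.action.condition.nested.age = 30d
```

The Delete action only runs when a rollover is triggered, so it bounds disk usage without needing an external cron job.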





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15259) The deserialization time of HOS20 is longer than in HOS16

2016-11-22 Thread liyunzhang_intel (JIRA)
liyunzhang_intel created HIVE-15259:
---

 Summary: The deserialization time of HOS20 is longer than in HOS16
 Key: HIVE-15259
 URL: https://issues.apache.org/jira/browse/HIVE-15259
 Project: Hive
  Issue Type: Improvement
Reporter: liyunzhang_intel


Deploy Hive on Spark with Spark 1.6 and with Spark 2.0.
Run a query: on the latest code (with Spark 2.0), the deserialization time 
of a task is 4 sec, while with Spark 1.6 it is 1 sec. The details are in the 
attached picture.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)