[jira] [Created] (HIVE-20644) Avoid exposing sensitive information through an error message

2018-09-26 Thread Ashutosh Bapat (JIRA)
Ashutosh Bapat created HIVE-20644:
-

 Summary: Avoid exposing sensitive information through an error 
message
 Key: HIVE-20644
 URL: https://issues.apache.org/jira/browse/HIVE-20644
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Ashutosh Bapat
Assignee: Ashutosh Bapat


The HiveException raised from the following methods exposes the data row that 
caused the runtime exception.
 # ReduceRecordSource::GroupIterator::next() - around line 372
 # MapOperator::process() - around line 567
 # ExecReducer::reduce() - around line 243

In all these cases, a string representation of the row is constructed on the fly 
and included in the error message.

VectorMapOperator::process() - around line 973 raises the same exception but 
it's not exposing the row since the row contents are not included in the error 
message.

While trying to reproduce the above error, I also found that the arguments to a UDF 
get exposed in log messages from FunctionRegistry::invoke() around line 1114. 
This too can leak sensitive information through an error message.

This way, sensitive information is leaked to a user through the exception 
message, even though that information may not otherwise be available to the user. 
Hence it's a kind of security breach or violation of access control.

The contents of the row or the arguments to a function may be useful for 
debugging, so they are worth adding to the logs. Hence the proposal here is to log 
a separate message at log level DEBUG or INFO containing the string 
representation of the row. Users can configure their logging so that DEBUG/INFO 
messages do not go to the client but are still available in the Hive 
server logs for debugging. The actual exception message will not contain any 
sensitive data such as row data or argument data.
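
A minimal sketch of the proposal above, under stated assumptions: it uses java.util.logging so that it stays self-contained (Hive itself would use its own logging setup), FINE stands in for DEBUG, and `RowProcessor`, `processRow` and `RowProcessingException` are hypothetical names, not Hive classes. The underlying cause is deliberately not chained to the thrown exception, because a cause's message can itself contain the row.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for HiveException; not a Hive class.
class RowProcessingException extends Exception {
    RowProcessingException(String message) {
        super(message);
    }
}

class RowProcessor {
    private static final Logger LOG = Logger.getLogger(RowProcessor.class.getName());

    // Parses the row as an integer. On failure, the row contents go to the
    // server-side log only, never into the exception message.
    static int processRow(String row) throws RowProcessingException {
        try {
            return Integer.parseInt(row.trim());
        } catch (RuntimeException e) {
            // Log row and cause at FINE (DEBUG-like); guarded so the string
            // is only built when that level is enabled.
            if (LOG.isLoggable(Level.FINE)) {
                LOG.log(Level.FINE, "Error while processing row: " + row, e);
            }
            // The message names the failure type but carries no row data.
            throw new RowProcessingException(
                "Error while processing row (" + e.getClass().getSimpleName()
                    + "); row contents are logged at debug level on the server");
        }
    }

    public static void main(String[] args) {
        try {
            processRow("secret-value");
        } catch (RowProcessingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, what reaches the client is decided by the exception message alone, while the logging configuration decides whether the row is recorded at all.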



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68828: HIVE-20601 : EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-26 Thread Alexander Kolbasov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68828/#review209064
---


Ship it!





- Alexander Kolbasov


On Sept. 24, 2018, 8:42 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68828/
> ---
> 
> (Updated Sept. 24, 2018, 8:42 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It will be useful to have the environmentContext passed to 
> DbNotificationListener in this case, to know if the alter happened due to a 
> stat change.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  f52ff91a8f2e7710801dcadc4a83ce454992a66a 
> 
> 
> Diff: https://reviews.apache.org/r/68828/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



[GitHub] hive pull request #438: HIVE-20632: Query with get_splits UDF fails if mater...

2018-09-26 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/438


---


[jira] [Created] (HIVE-20643) Hive job hangs when executing aggregate calculations such as max(), count()

2018-09-26 Thread vincentzhao (JIRA)
vincentzhao created HIVE-20643:
--

 Summary: Hive job hangs when executing aggregate calculations such 
as max(), count()
 Key: HIVE-20643
 URL: https://issues.apache.org/jira/browse/HIVE-20643
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: vincentzhao


If I run "select * from table limit 10" in Hive, it returns successfully, 
but when I run "select max() from table", it hangs.

*The output message:*

Logging initialized using configuration in 
file:/app/apache-hive-2.3.0-bin/conf/hive-log4j2.properties Async: true
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
future versions. Consider using a different execution engine (i.e. spark, tez) 
or using Hive 1.X releases.
Query ID = root_20180927085541_71b0e652-ad29-4a2f-ac53-1e9a4f34cec1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
 set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
 set hive.exec.reducers.max=
In order to set a constant number of reducers:
 set mapreduce.job.reduces=
Cannot run job locally: Input Size (= 562011785) is larger than 
hive.exec.mode.local.auto.inputbytes.max (= 25600).

 

*The hive.log:*


"new HiveConf()"

2018-09-27T08:55:46,192 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
lockmgr.DbLockManager (DbLockManager.java:lock(104)) - Response to 
queryId=root_20180927085541_71b0e652-ad29-4a2f-ac53-1e9a4f34cec1 
LockResponse(lockid:19039, state:ACQUIRED)
2018-09-27T08:55:46,193 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(507)) - Started 
heartbeat with delay/interval = 15/15 MILLISECONDS for query: 
root_20180927085541_71b0e652-ad29-4a2f-ac53-1e9a4f34cec1
2018-09-27T08:55:46,193 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (Driver.java:execute(1734)) - Executing 
command(queryId=root_20180927085541_71b0e652-ad29-4a2f-ac53-1e9a4f34cec1): 
select count(*) from tick2
2018-09-27T08:55:46,195 WARN [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (Driver.java:logMrWarning(2094)) - Hive-on-MR is deprecated in Hive 2 
and may not be available in the future versions. Consider using a different 
execution engine (i.e. spark, tez) or using Hive 1.X releases.
2018-09-27T08:55:46,195 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (SessionState.java:printInfo()) - WARNING: Hive-on-MR is 
deprecated in Hive 2 and may not be available in the future versions. Consider 
using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
2018-09-27T08:55:46,197 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (SessionState.java:printInfo()) - Query ID = 
root_20180927085541_71b0e652-ad29-4a2f-ac53-1e9a4f34cec1
2018-09-27T08:55:46,197 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (SessionState.java:printInfo()) - Total jobs = 1
2018-09-27T08:55:46,209 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (SessionState.java:printInfo()) - Launching Job 1 out of 1
2018-09-27T08:55:46,211 INFO [5cf99469-4f8e-49b1-9108-b94b3928a2df main]: 
ql.Driver (Driver.java:launchTask(2174)) - Starting task [Stage-1:MAPRED] in 
parallel
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - Number of reduce tasks determined at 
compile time: 1
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - In order to change the average load for a 
reducer (in bytes):
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - set 
hive.exec.reducers.bytes.per.reducer=
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - In order to limit the maximum number of 
reducers:
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - set hive.exec.reducers.max=
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - In order to set a constant number of 
reducers:
2018-09-27T08:55:46,212 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - set mapreduce.job.reduces=
2018-09-27T08:55:46,225 INFO [Thread-6]: exec.Task 
(SessionState.java:printInfo()) - Cannot run job locally: Input Size (= 
562011785) is larger than hive.exec.mode.local.auto.inputbytes.max (= 25600)
2018-09-27T08:55:46,231 INFO [Thread-6]: ql.Context 
(Context.java:getMRScratchDir(454)) - New scratch dir is 
hdfs://cluster/app/apache-hive-2.3.0-bin/tmp/hive-root/root/5cf99469-4f8e-49b1-9108-b94b3928a2df/hive_2018-09-27_08-55-41_865_3811548964141108768-2
2018-09-27T08:55:46,237 INFO 

[jira] [Created] (HIVE-20642) Add Tests for HIVE-12812

2018-09-26 Thread Alice Fan (JIRA)
Alice Fan created HIVE-20642:


 Summary: Add Tests for HIVE-12812
 Key: HIVE-20642
 URL: https://issues.apache.org/jira/browse/HIVE-20642
 Project: Hive
  Issue Type: Test
Affects Versions: 4.0.0
Reporter: Alice Fan
Assignee: Alice Fan








Re: Review Request 68827: Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-26 Thread Alexander Kolbasov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68827/#review209055
---



Good overall, some small things below.


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 917 (patched)


I think a better name would be something like truncateMapByKey.

Can you guarantee that it always receives a non-null map? Then you don't need 
to check for null.

Since you don't care about the values, can this be a generic map with any value type?
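
As a rough illustration of the suggestion above, a key-based filter does not need to know the value type. This sketch is hypothetical: `MapKeyFilter` and `filterMapByKeys` are illustrative names, not the code under review, full-match semantics (`matches()`) are an assumption, and it returns a copy instead of modifying its argument.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class MapKeyFilter {
    // Returns a copy of the input without the entries whose key fully
    // matches any of the exclude patterns. Works for any value type V.
    static <V> Map<String, V> filterMapByKeys(Map<String, V> input, List<Pattern> excludes) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> entry : input.entrySet()) {
            boolean excluded = false;
            for (Pattern pattern : excludes) {
                if (pattern.matcher(entry.getKey()).matches()) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> params = new LinkedHashMap<>();
        params.put("numRows", 10);
        params.put("big_param_1", 999); // hypothetical large parameter
        List<Pattern> excludes = Arrays.asList(Pattern.compile("big_.*"));
        System.out.println(filterMapByKeys(params, excludes)); // prints {numRows=10}
    }
}
```

Returning a copy also sidesteps the question of whether anyone else holds a reference to the original map.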



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 918 (patched)


You already have this check when you call this function.



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 928 (patched)


Looks like this can be a map with any value.



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 931 (patched)


Naming: maybe something like 'truncateMapByKeys'?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
Line 297 (original), 310 (patched)


The default exclude filter is empty. When someone sets this variable they 
should understand what they are doing.



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
Lines 37 (patched)


Can you add a test with an empty exclude pattern?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
Lines 44 (patched)


It would be better to use hamcrest asserts and just check that your map 
matches your expected map with a single assert.


- Alexander Kolbasov


On Sept. 24, 2018, 8:37 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68827/
> ---
> 
> (Updated Sept. 24, 2018, 8:37 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  30ea7f8129 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  c681a87a1c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  2668b05320 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68827/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 68828: HIVE-20601 : EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-26 Thread Bharathkrishna Guruvayoor Murali via Review Board


> On Sept. 26, 2018, 7:24 p.m., Alexander Kolbasov wrote:
> > Are there any other cases where environment context isn't passed to the 
> > listener?

In the other cases, it is passed to the listener.


> On Sept. 26, 2018, 7:24 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
> > Line 746 (original), 746 (patched)
> > 
> >
> > Looks like formatting is off here

I think the formatting is correct here; checkstyle reports no issues.


- Bharathkrishna


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68828/#review209036
---


On Sept. 24, 2018, 8:42 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68828/
> ---
> 
> (Updated Sept. 24, 2018, 8:42 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It will be useful to have the environmentContext passed to 
> DbNotificationListener in this case, to know if the alter happened due to a 
> stat change.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  f52ff91a8f2e7710801dcadc4a83ce454992a66a 
> 
> 
> Diff: https://reviews.apache.org/r/68828/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 68827: Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-26 Thread Alexander Kolbasov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68827/#review209037
---



Please put JIRA ID in the header and also in the Bug field on the right of the 
review board.

- Alexander Kolbasov


On Sept. 24, 2018, 8:37 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68827/
> ---
> 
> (Updated Sept. 24, 2018, 8:37 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  30ea7f8129 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  c681a87a1c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  2668b05320 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68827/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 68828: HIVE-20601 : EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-26 Thread Alexander Kolbasov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68828/#review209036
---



Are there any other cases where environment context isn't passed to the 
listener?


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
Line 746 (original), 746 (patched)


Looks like formatting is off here


- Alexander Kolbasov


On Sept. 24, 2018, 8:42 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68828/
> ---
> 
> (Updated Sept. 24, 2018, 8:42 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It will be useful to have the environmentContext passed to 
> DbNotificationListener in this case, to know if the alter happened due to a 
> stat change.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  f52ff91a8f2e7710801dcadc4a83ce454992a66a 
> 
> 
> Diff: https://reviews.apache.org/r/68828/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



[jira] [Created] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)
Deepak Jaiswal created HIVE-20641:
-

 Summary: load_data_using_job is failing
 Key: HIVE-20641
 URL: https://issues.apache.org/jira/browse/HIVE-20641
 Project: Hive
  Issue Type: Bug
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal


load_data_using_job is failing due to result diff.





[jira] [Created] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-20640:
-

 Summary: Upgrade Hive to use ORC 1.5.3
 Key: HIVE-20640
 URL: https://issues.apache.org/jira/browse/HIVE-20640
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman








Re: Review Request 68827: Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-26 Thread Andrew Sherman via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68827/#review209031
---



Code looks clean, but I have some scary questions.


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 517 (patched)


Make this clearer.
A list of comma-separated regexes that are used to reduce the size of HMS 
Notification messages.
If a partition name (?) or table name (?) matches a regex then the 
partition or table is excluded from the notification.
Or something



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 1412 (patched)


Add javadoc



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 913 (patched)


Good documentation!
Are you 100% sure that this Map is never used by anyone else? What about 
future code?



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 917 (patched)


make private



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
Lines 106 (patched)


So what if I make a mistake in a regex in the config? This code will fail; what 
will happen then? How will I know what failed if my server won't start?
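
One possible way to surface such a mistake, sketched under assumptions: compile each configured pattern up front and fail with a message that names the offending entry. `ExcludePatternParser` and `parse` are hypothetical names, and the comma-separated format is assumed from this discussion rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ExcludePatternParser {
    // Splits a comma-separated config value and compiles each entry.
    // An invalid entry produces an error that names the bad regex.
    static List<Pattern> parse(String configValue) {
        List<Pattern> patterns = new ArrayList<>();
        if (configValue == null || configValue.trim().isEmpty()) {
            return patterns; // empty filter: nothing is excluded
        }
        for (String raw : configValue.split(",")) {
            String regex = raw.trim();
            if (regex.isEmpty()) {
                continue; // tolerate stray commas
            }
            try {
                patterns.add(Pattern.compile(regex));
            } catch (PatternSyntaxException e) {
                throw new IllegalArgumentException(
                    "Invalid exclude pattern '" + regex + "': " + e.getDescription(), e);
            }
        }
        return patterns;
    }

    public static void main(String[] args) {
        System.out.println(parse("a.*, b[0-9]+").size());
        try {
            parse("a.*,[unclosed");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that splitting on commas means a regex that itself contains a comma (for example a `{1,2}` quantifier) cannot be expressed in this format; whether that matters depends on the patterns people are expected to configure.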



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
Line 297 (original), 310 (patched)


This is used by the notifications that (we think) we understand, but it is 
also used by JSONAcidWriteMessage. So what happens if someone uses your new 
mechanism to reduce the size of messages, but affects JSONAcidWriteMessage? In 
other words, there could be multiple uses for notifications in a complex system, 
and this mechanism affects them all.



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
Lines 35 (patched)


Add javadoc



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
Lines 36 (patched)


We already have TestMetaStoreServerUtils, does this need a new file?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
Lines 66 (patched)


add test with the default param map from MetastoreConf


- Andrew Sherman


On Sept. 24, 2018, 8:37 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68827/
> ---
> 
> (Updated Sept. 24, 2018, 8:37 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  30ea7f8129 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  c681a87a1c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  2668b05320 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68827/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



[jira] [Created] (HIVE-20639) Add ability to Write Data from Hive Table/Query to Kafka Topic

2018-09-26 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-20639:
-

 Summary: Add ability to Write Data from Hive Table/Query to Kafka 
Topic
 Key: HIVE-20639
 URL: https://issues.apache.org/jira/browse/HIVE-20639
 Project: Hive
  Issue Type: New Feature
  Components: kafka integration
Reporter: slim bouguerra
Assignee: slim bouguerra


This patch adds multiple record writers to allow Hive users to write data 
directly to a Kafka Topic.
The writer provides multiple write semantics modes.
* A None, where all the records will be delivered with no guarantee or retries.
* B At_least_once, where each record will be delivered with retries from the Kafka 
Producer and Hive Write Task. 
* C Exactly_once, where the writer will use the Kafka Transaction API to ensure that 
each record is delivered once.

In addition to the new feature, I have refactored the existing code to make it 
more readable.







Review Request 68852: HIVE-20636

2018-09-26 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68852/
---

Review request for hive, Ashutosh Chauhan and Vineet Garg.


Bugs: HIVE-20636
https://issues.apache.org/jira/browse/HIVE-20636


Repository: hive-git


Description
---

HIVE-20636


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 4c5695c68a4a19587b15d5512b900785ea49d6ef 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_14.q.out 
ced0f6868ed02bcdfefab9bcfc33b59673502f74 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_15.q.out 
661538700498938983812a1d09e4be2dcdc47e26 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_16.q.out 
c02546ae5205751f212266987e35702157f85c1a 
  ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out 
771aa6d7dd97b858b2613300737ca6a4e32bcfc6 
  ql/src/test/results/clientpositive/llap/check_constraint.q.out 
357dd9f8d6c6a917530ea9807b9214657aa99cc5 
  ql/src/test/results/clientpositive/llap/correlationoptimizer1.q.out 
224d980f81b81dd8a9f4a80ddaea17c6e6f7d08a 
  ql/src/test/results/clientpositive/llap/correlationoptimizer2.q.out 
3973000ffa2c8cc9f2118a86f23072d397fc4ec9 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out 
889c2ad5327b3eeae5ed0e13d66b17b908d4ad5c 
  ql/src/test/results/clientpositive/llap/insert_into_default_keyword.q.out 
4f4d1b11ddc9c7e7e2d38036a5a7c7f03ee4d2f0 
  ql/src/test/results/clientpositive/llap/join32_lessSize.q.out 
aba0c247044730c962a70ea5bbb3a96cfd68a634 
  ql/src/test/results/clientpositive/llap/join46.q.out 
81d9dbf8236fce1ca5498dae5fbe9d96937a879c 
  ql/src/test/results/clientpositive/llap/join_emit_interval.q.out 
2c64ce853cb89ab4b86b4173e6aed39a5dcba4de 
  ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out 
9d06dd87e7ab4f8315e8187b3a225956ab33 
  ql/src/test/results/clientpositive/llap/llap_smb_ptf.q.out 
c7b7d7068f36021abb3075fa15895de9739a9294 
  ql/src/test/results/clientpositive/llap/load_data_using_job.q.out 
8a824678c2c5b61c44a56f0840e31578e39334ba 
  ql/src/test/results/clientpositive/llap/mapjoin46.q.out 
b109e4ed3d4e101493a6aa9703d046cef6ca451a 
  ql/src/test/results/clientpositive/llap/mapjoin_emit_interval.q.out 
aaddb99e8a41e1a9904547aac92115d36411baee 
  ql/src/test/results/clientpositive/llap/subquery_in.q.out 
c00fadcfca6d02b859024e5d22fc2d0d04d991fb 
  ql/src/test/results/clientpositive/llap/subquery_multi.q.out 
e9a355883dc8400e7539ccd2d30a59ab2c542abb 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out 
64df788e8bb710299a62cc7e34215924e2033784 
  ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 
64b1067ab0f464a68d8aac8c309c1bd9c514c9c3 
  ql/src/test/results/clientpositive/llap/subquery_select.q.out 
30c6d13efc8ff9819ebecc6ae3ca808a10138c04 
  ql/src/test/results/clientpositive/llap/tez_dynpart_hashjoin_3.q.out 
6b801fb790ea920ca8bdfa72426927e1eae44909 
  ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out 
9d16a8d3024380cd41869d9b64967183e4eec841 
  ql/src/test/results/clientpositive/llap/tez_join_tests.q.out 
4471f04f41aa5b11489190aaae7f138e358808ee 
  ql/src/test/results/clientpositive/llap/tez_joins_explain.q.out 
08a2215d4cd2697d1f6714d677e169657c727465 
  ql/src/test/results/clientpositive/llap/tez_smb_empty.q.out 
b529bbbafddb2e44ec889e852897147fb2f0a4c9 
  ql/src/test/results/clientpositive/llap/unionDistinct_1.q.out 
e474c28ac7f9f69db646507d2f19f61472d57f46 
  ql/src/test/results/clientpositive/llap/vector_coalesce_3.q.out 
57f3892a693a8c0ea971f20793bd8095f1f6cf8c 
  ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out 
1309bf658a0a1228a05cd5b2ce727df8b345085c 
  ql/src/test/results/clientpositive/llap/vector_outer_join0.q.out 
49dd4ff18718c3c091a3daff17e1caed689b3371 
  ql/src/test/results/clientpositive/llap/vectorized_join46.q.out 
9e9d78e6fb7c6485fd27e5c244c4f7e2ab6dbb68 
  ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 
2c1b7088abf053d7e5615e989912ba84ce33544b 


Diff: https://reviews.apache.org/r/68852/diff/1/


Testing
---


Thanks,

Jesús Camacho Rodríguez



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread denys kuzmenko via Review Board


> On Sept. 26, 2018, 11:47 a.m., Antal Sinkovits wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3062 (patched)
> > 
> >
> > Why is the default value -1? All the checks seem to go against >0. 
> > What happens when the value is 0?

If the degree of parallelism is lower than 1, there won't be any restrictions 
on the number of parallel compilations. "-1" means unbounded. "0" doesn't make 
sense, so it's just ignored (and will be unbounded). Otherwise, the number of 
quotas will be equal to the "hive.driver.parallel.compilation.global.limit" 
value.
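
The rule described above can be sketched with a `java.util.concurrent.Semaphore`. This is an illustration and not the patch itself: `CompileQuota` and its methods are hypothetical names, and only the configuration key comes from this thread.

```java
import java.util.concurrent.Semaphore;

class CompileQuota {
    // null means "unbounded": no limit on parallel compilations.
    private final Semaphore permits;

    // limit is the value of hive.driver.parallel.compilation.global.limit.
    // Any value below 1 (including -1 and 0) is treated as unbounded.
    CompileQuota(int limit) {
        this.permits = limit > 0 ? new Semaphore(limit) : null;
    }

    boolean isBounded() {
        return permits != null;
    }

    // Returns true if a compilation may start now.
    boolean tryAcquire() {
        return permits == null || permits.tryAcquire();
    }

    // Must only be called after a successful tryAcquire().
    void release() {
        if (permits != null) {
            permits.release();
        }
    }

    public static void main(String[] args) {
        CompileQuota quota = new CompileQuota(2);
        System.out.println(quota.tryAcquire()); // true
        System.out.println(quota.tryAcquire()); // true
        System.out.println(quota.tryAcquire()); // false, both quotas are in use
        quota.release();
        System.out.println(quota.tryAcquire()); // true again
    }
}
```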


> On Sept. 26, 2018, 11:47 a.m., Antal Sinkovits wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
> > Lines 12 (patched)
> > 
> >
> > I think there is a typo here, and it should be CompileLock.class.

Thanks for the catch!


- denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review209017
---


On Sept. 26, 2018, 1:08 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 26, 2018, 1:08 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/7/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread denys kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/
---

(Updated Sept. 26, 2018, 1:08 p.m.)


Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, and 
Peter Vary.


Bugs: HIVE-20535
https://issues.apache.org/jira/browse/HIVE-20535


Repository: hive-git


Description
---

When removing the compile lock, it is quite risky to remove it entirely.

It would be good to provide a pool size for the concurrent compilation, so the 
administrator can limit the load


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 


Diff: https://reviews.apache.org/r/68683/diff/7/

Changes: https://reviews.apache.org/r/68683/diff/6-7/


Testing
---

Added CompileLockTest


File Attachments


HIVE-20535.1.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
HIVE-20535.14.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch


Thanks,

denys kuzmenko



[GitHub] hive pull request #434: HIVE-20607: TxnHandler should use PreparedStatement ...

2018-09-26 Thread sankarh
Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/434


---


Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-26 Thread Antal Sinkovits via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review209017
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 3062 (patched)


Why is the default value -1? All the checks seem to go against >0. What 
happens when the value is 0?



ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
Lines 12 (patched)


I think there is a typo here, and it should be CompileLock.class.


- Antal Sinkovits


On Sept. 25, 2018, 10:19 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 25, 2018, 10:19 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for concurrent compilation, so 
> the administrator can limit the load.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/6/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68836: HIVE-17917 VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-26 Thread Saurabh Seth

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68836/
---

(Updated Sept. 26, 2018, 10:06 a.m.)


Review request for hive and Eugene Koifman.


Changes
---

Fixed the checkstyle and findbugs errors. These were pre-existing ones but 
because they were on lines where I made minor changes, they were reported as 
new ones. These were trivial ones, so I fixed them:

./ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java:266:
return new OrcSplit.OffsetAndBucketProperty(-1,-1, 
syntheticTxnInfo.syntheticWriteId);:56: warning: ',' is not followed by 
whitespace.

Redundant null check at VectorizedOrcAcidRowBatchReader.java:[line 495]


Bugs: HIVE-17917
https://issues.apache.org/jira/browse/HIVE-17917


Repository: hive-git


Description
---

VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() is currently (after 
HIVE-17458) computed once per split. It could instead be computed once per file 
(since the result is the same for each split of the same file) and passed 
along in OrcSplit.
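
The "once per file, then passed along in the split" idea can be sketched as follows. SplitSketch and PerFilePlanner are hypothetical stand-ins written for this example; they do not reproduce OrcSplit or OrcInputFormat, and the expensive step is supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

/** Hypothetical stand-in for a split; it carries the value computed for its file. */
final class SplitSketch {
  final String file;
  final long start;
  final long fileLevelValue;

  SplitSketch(String file, long start, long fileLevelValue) {
    this.file = file;
    this.start = start;
    this.fileLevelValue = fileLevelValue;
  }
}

/** Hypothetical planner: runs the expensive per-file step once and reuses the result. */
final class PerFilePlanner {
  private final ToLongFunction<String> expensivePerFileStep;

  PerFilePlanner(ToLongFunction<String> expensivePerFileStep) {
    this.expensivePerFileStep = expensivePerFileStep;
  }

  List<SplitSketch> plan(String file, long[] splitStarts) {
    // Computed once here instead of once inside every split's reader.
    long value = expensivePerFileStep.applyAsLong(file);
    List<SplitSketch> splits = new ArrayList<>();
    for (long start : splitStarts) {
      splits.add(new SplitSketch(file, start, value));
    }
    return splits;
  }
}
```

Each reader can then take fileLevelValue from its split rather than recomputing it.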


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java f34f393fb8 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java bce7977929 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 1841cfaa2e 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
208aeb5b1f 
  ql/src/test/queries/clientpositive/acid_vectorization_original.q 5082aedf90 
  ql/src/test/results/clientpositive/llap/acid_vectorization_original.q.out 
99c741c7bd 


Diff: https://reviews.apache.org/r/68836/diff/2/

Changes: https://reviews.apache.org/r/68836/diff/1-2/


Testing
---


Thanks,

Saurabh Seth



Review Request 68850: Allow any udfs with 0 arguments or with constant arguments as part of default clause

2018-09-26 Thread Miklos Gergely

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68850/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-20637
https://issues.apache.org/jira/browse/HIVE-20637


Repository: hive-git


Description
---

Allow any UDFs with 0 arguments or with constant arguments as part of the 
default clause.

Also removed some unused code.
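
As a rough illustration of the rule described above, the sketch below accepts a default expression if it is a constant or a function call whose arguments are all constants (a call with zero arguments passes trivially). The expression classes are hypothetical and unrelated to Hive's own AST types, and rejecting nested calls is an assumption of this sketch; the patch may treat them differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical expression model, used only for this example. */
interface ExprSketch {}

final class ConstantSketch implements ExprSketch {
  final Object value;

  ConstantSketch(Object value) {
    this.value = value;
  }
}

final class ColumnRefSketch implements ExprSketch {
  final String name;

  ColumnRefSketch(String name) {
    this.name = name;
  }
}

final class CallSketch implements ExprSketch {
  final String function;
  final List<ExprSketch> args;

  CallSketch(String function, ExprSketch... args) {
    this.function = function;
    this.args = Arrays.asList(args);
  }
}

final class DefaultClauseCheck {
  /** Returns true for a constant, or for a call whose arguments are all constants. */
  static boolean isAllowed(ExprSketch e) {
    if (e instanceof ConstantSketch) {
      return true;
    }
    if (e instanceof CallSketch) {
      for (ExprSketch arg : ((CallSketch) e).args) {
        if (!(arg instanceof ConstantSketch)) {
          return false; // column references and nested calls are rejected in this sketch
        }
      }
      return true;
    }
    return false;
  }
}
```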


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b655ab1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 412fca2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 344e9fc 
  
ql/src/test/queries/clientnegative/default_constraint_invalid_default_value2.q 
ec5b67a 
  
ql/src/test/queries/clientnegative/default_constraint_invalid_default_value_type.q
 1f1a9db 
  ql/src/test/queries/clientpositive/insert_into_default_keyword.q ebef1a4 
  
ql/src/test/results/clientnegative/default_constraint_invalid_default_value2.q.out
 76e5aeb 
  
ql/src/test/results/clientnegative/default_constraint_invalid_default_value_type.q.out
 61e0a2f 
  ql/src/test/results/clientpositive/llap/insert_into_default_keyword.q.out 
4f4d1b1 


Diff: https://reviews.apache.org/r/68850/diff/1/


Testing
---

Tested on a local cluster. Added a new section to a q test, and removed the 
tests that specifically checked that this was not allowed.


Thanks,

Miklos Gergely



[jira] [Created] (HIVE-20638) Upgrade version of Jetty to 9.3.25.v20180904

2018-09-26 Thread Laszlo Bodor (JIRA)
Laszlo Bodor created HIVE-20638:
---

 Summary: Upgrade version of Jetty to 9.3.25.v20180904
 Key: HIVE-20638
 URL: https://issues.apache.org/jira/browse/HIVE-20638
 Project: Hive
  Issue Type: Bug
Reporter: Laszlo Bodor








[jira] [Created] (HIVE-20637) Allow any udfs with 0 arguments or with constant arguments as part of default clause

2018-09-26 Thread Miklos Gergely (JIRA)
Miklos Gergely created HIVE-20637:
-

 Summary: Allow any udfs with 0 arguments or with constant 
arguments as part of default clause
 Key: HIVE-20637
 URL: https://issues.apache.org/jira/browse/HIVE-20637
 Project: Hive
  Issue Type: Task
  Components: Hive
Affects Versions: 3.0.1
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 3.0.1








Review Request 68848: HIVE-20540

2018-09-26 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68848/
---

Review request for hive and Gopal V.


Bugs: HIVE-20540
https://issues.apache.org/jira/browse/HIVE-20540


Repository: hive-git


Description
---

Vectorization : Support loading bucketed tables using sorted dynamic partition 
optimizer - II

Followup to HIVE-20510 with remaining issues

1. Avoid using Reflection.
2. In VectorizationContext, set up the VectorExpression in the correct place. 
Currently the setup may be missed in certain cases.
3. In BucketNumExpression, make sure that a value is not overwritten before it 
is processed. Use a flag to achieve this.
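
Point 3 describes guarding a value with a flag so it cannot be overwritten before it is consumed. A generic sketch of that pattern follows; PendingBucketSlot and its methods are hypothetical names for this example and are not the BucketNumExpression code.

```java
/** Hypothetical single-slot holder: a flag marks whether the value is still unprocessed. */
final class PendingBucketSlot {
  private int bucketNum;
  private boolean pending; // true while a stored value is waiting to be processed

  /** Stores a value unless an earlier one is still unprocessed; returns whether it was stored. */
  boolean offer(int value) {
    if (pending) {
      return false;
    }
    bucketNum = value;
    pending = true;
    return true;
  }

  /** Hands out the stored value and clears the flag so a new value can be stored. */
  int take() {
    if (!pending) {
      throw new IllegalStateException("no value pending");
    }
    pending = false;
    return bucketNum;
  }
}
```

Here a second offer is refused until take has run, which is one way to keep the first value intact.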


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
55d2a16f03 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/BucketNumExpression.java
 d8c696c302 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 1a8395a71b 


Diff: https://reviews.apache.org/r/68848/diff/1/


Testing
---


Thanks,

Deepak Jaiswal