[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Attachment: (was: HIVE-16058.2.patch)

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.
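
For illustration only, a minimal sketch of the kind of test-mode guard being 
discussed; the class, flag, and method names below are made up and this is not 
Hive's actual CBO fallback code:
{code}
// Hypothetical sketch: rethrow the CBO failure in test mode instead of
// silently falling back to the non-CBO path.
public class CboFallbackSketch {

  // Stand-in for a test-only switch; not a real HiveConf property.
  static final boolean FAIL_ON_CBO_ERROR_IN_TESTS =
      Boolean.getBoolean("sketch.cbo.fail.on.error");

  static String planQuery(String query) throws Exception {
    try {
      return planWithCbo(query);
    } catch (Exception semanticError) {
      if (FAIL_ON_CBO_ERROR_IN_TESTS) {
        // Test mode: surface the CBO failure so genuine bugs are not eclipsed.
        throw semanticError;
      }
      // Behavior described above: silently fall back to the non-CBO path.
      return planWithoutCbo(query);
    }
  }

  static String planWithCbo(String query) throws Exception {
    throw new Exception("SemanticException thrown inside the CBO path");
  }

  static String planWithoutCbo(String query) {
    return "non-CBO plan for: " + query;
  }

  public static void main(String[] args) throws Exception {
    // Prints the fallback plan unless -Dsketch.cbo.fail.on.error=true is set.
    System.out.println(planQuery("select 1"));
  }
}
{code}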



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Status: Patch Available  (was: Open)

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16311) Improve the performance for FastHiveDecimalImpl.fastDivide

2017-04-20 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-16311:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~colin_mjj] for the contribution, and [~xuefuz] and 
[~mmccline] for the review.

> Improve the performance for FastHiveDecimalImpl.fastDivide
> --
>
> Key: HIVE-16311
> URL: https://issues.apache.org/jira/browse/HIVE-16311
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
> Fix For: 3.0.0
>
> Attachments: HIVE-16311.001.patch, HIVE-16311.002.patch, 
> HIVE-16311.003.patch, HIVE-16311.004.patch, HIVE-16311.005.patch, 
> HIVE-16311.006.patch, HIVE-16311.007.patch, HIVE-16311.008.patch, 
> HIVE-16311.withTrailingZero.patch
>
>
> FastHiveDecimalImpl.fastDivide has poor performance when evaluating an 
> expression such as 12345.67/123.45.
> There are 2 points that can be improved:
> 1. Don't always use HiveDecimal.MAX_SCALE as the scale when doing the 
> BigDecimal.divide.
> 2. Get the precision for the BigInteger in a fast way if possible.
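
As a rough, standalone illustration of point 1 (this is plain JDK code, not 
FastHiveDecimalImpl): dividing with a fixed maximum scale carries far more 
fractional digits than the result needs, while a scale derived from the 
operands keeps the intermediate BigDecimal small. The constant 38 below simply 
stands in for HiveDecimal.MAX_SCALE, and the bit-length trick at the end 
sketches point 2.
{code}
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

public class DivideScaleDemo {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("12345.67");
    BigDecimal b = new BigDecimal("123.45");

    // Always using the maximum scale: the quotient carries 38 fractional digits.
    BigDecimal wide = a.divide(b, 38, RoundingMode.HALF_UP);

    // Using a tighter scale chosen from the operands keeps the value small.
    int scale = Math.min(38, a.scale() + 10);
    BigDecimal narrow = a.divide(b, scale, RoundingMode.HALF_UP);

    System.out.println("max scale : " + wide);
    System.out.println("tighter   : " + narrow);

    // Point 2: an approximate decimal-digit count for a BigInteger can be
    // derived from its bit length instead of converting it to a string.
    BigInteger unscaled = narrow.unscaledValue();
    int approxDigits = (int) Math.ceil(unscaled.bitLength() * Math.log10(2));
    System.out.println("approx decimal digits: " + approxDigits);
  }
}
{code}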



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16083) Hive should escape fields terminator when use textfile format

2017-04-20 Thread Anoop S Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop S Nair reassigned HIVE-16083:
---

Assignee: Anoop S Nair

> Hive should escape fields terminator when use textfile format
> -
>
> Key: HIVE-16083
> URL: https://issues.apache.org/jira/browse/HIVE-16083
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 2.1.1
>Reporter: Winston Churchill
>Assignee: Anoop S Nair
>Priority: Trivial
>
> Create a table with a comma as the fields terminator and insert data that contains commas:
> {code}
> create table test_1(id int,name string) row format delimited fields 
> terminated by ',' stored as textfile;
> insert into table test_1 values (1,'a,b,c');
> {code}
> {code}
> select * from test_1;
> +------------+--------------+
> | test_1.id  | test_1.name  |
> +------------+--------------+
> | 1          | a            |
> +------------+--------------+
> 4 rows selected (0.363 seconds)
> {code}
> Do we need to escape the fields terminator?
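
A small standalone illustration (plain Java, not Hive's SerDe code) of why the 
unescaped delimiter loses data: once the comma inside the value is written 
as-is, the reader cannot tell it apart from the field separator.
{code}
import java.util.Arrays;

public class DelimiterEscapeDemo {
  public static void main(String[] args) {
    // Row written without escaping: id=1, name="a,b,c".
    String rawLine = "1,a,b,c";
    // The reader splits on every comma, so "a,b,c" is broken apart and only
    // its first piece lands in the name column, matching the SELECT output.
    System.out.println(Arrays.toString(rawLine.split(",")));

    // If the embedded commas were escaped on write, the reader could split on
    // unescaped commas only and recover both fields.
    String escapedLine = "1,a\\,b\\,c";
    System.out.println(Arrays.toString(escapedLine.split("(?<!\\\\),")));
  }
}
{code}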



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled

2017-04-20 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978044#comment-15978044
 ] 

Rui Li commented on HIVE-16047:
---

[~andrew.wang] thanks for the information. Is there any public API that achieves 
the same functionality?
Users have found the log quite confusing and annoying, and the "breaking" 
change is only in unreleased Hadoop. [~xuefuz], [~Ferd], [~sershe], do you think we 
should revert the patch? Or maybe we can revert it in master and keep it in 
branch-2.x?

> Shouldn't try to get KeyProvider unless encryption is enabled
> -
>
> Key: HIVE-16047
> URL: https://issues.apache.org/jira/browse/HIVE-16047
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch
>
>
> Found lots of following errors in HS2 log:
> {noformat}
> hdfs.KeyProviderCache: Could not find uri with key 
> [dfs.encryption.key.provider.uri] to create a keyProvider !!
> {noformat}
> Similar to HDFS-7931
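
A hedged sketch of the guard this issue asks for: only attempt the KeyProvider 
lookup when a provider URI is actually configured. Only 
org.apache.hadoop.conf.Configuration is used below; the property name is the 
one from the log message above, and obtainKeyProvider() is a placeholder, not 
the call Hive actually makes.
{code}
import org.apache.hadoop.conf.Configuration;

public class KeyProviderGuardSketch {
  private static final String KEY_PROVIDER_URI = "dfs.encryption.key.provider.uri";

  static boolean encryptionConfigured(Configuration conf) {
    // An empty or missing URI means there is no key provider to look up.
    return !conf.getTrimmed(KEY_PROVIDER_URI, "").isEmpty();
  }

  static Object obtainKeyProvider(Configuration conf) {
    if (!encryptionConfigured(conf)) {
      // Skip the lookup and the noisy "Could not find uri" log entirely.
      return null;
    }
    // Placeholder for the real lookup via the HDFS client APIs.
    throw new UnsupportedOperationException("real KeyProvider lookup goes here");
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    System.out.println("encryption configured: " + encryptionConfigured(conf));
  }
}
{code}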



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Status: Open  (was: Patch Available)

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Attachment: (was: HIVE-16058.2.patch)

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Attachment: HIVE-16058.2.patch

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Status: Patch Available  (was: Open)

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16058) Disable falling back to non-cbo for SemanticException for tests

2017-04-20 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16058:

Attachment: HIVE-16058.2.patch

Found the following issues: HIVE-16494, HIVE-16491, HIVE-16492.
Will use those jiras for fixing the bugs. This one is for failing queries in test 
mode (the original intent).

> Disable falling back to non-cbo for SemanticException for tests
> ---
>
> Key: HIVE-16058
> URL: https://issues.apache.org/jira/browse/HIVE-16058
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16058.1.patch, HIVE-16058.2.patch
>
>
> Currently the optimizer falls back to the non-CBO path if the CBO path throws an 
> exception of type SemanticException. This might be eclipsing some genuine 
> issues within the CBO path.
> We would like to turn off the fallback mechanism for tests to see if there 
> are indeed genuine issues/bugs within the CBO path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-12636) Ensure that all queries (with DbTxnManager) run in a transaction

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12636:
--
Attachment: HIVE-12636.12.patch

> Ensure that all queries (with DbTxnManager) run in a transaction
> 
>
> Key: HIVE-12636
> URL: https://issues.apache.org/jira/browse/HIVE-12636
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-12636.01.patch, HIVE-12636.02.patch, 
> HIVE-12636.03.patch, HIVE-12636.04.patch, HIVE-12636.05.patch, 
> HIVE-12636.06.patch, HIVE-12636.07.patch, HIVE-12636.09.patch, 
> HIVE-12636.10.patch, HIVE-12636.12.patch
>
>
> Assuming Hive is using DbTxnManager.
> Currently (as of this writing only auto-commit mode is supported), only 
> queries that write to an Acid table start a transaction.
> Read-only queries don't open a txn but still acquire locks.
> This makes internal structures confusing/odd.
> There are constantly 2 code paths to deal with, which is inconvenient and error 
> prone.
> Also, a txn id is a convenient "handle" for all locks/resources within a txn.
> Doing this would mean the client no longer needs to track the locks that it 
> acquired.  This enables further improvements to the metastore side of Acid.
> # Add a metastore call that does openTxn() and acquireLocks() in a single call.  This 
> is to make sure perf doesn't degrade for read-only queries.  (Would also be 
> useful for auto-commit write queries.)
> # Should RO queries generate txn ids from the same sequence?  (They could, for 
> example, use negative values of a different sequence.)  The txn id is part of the 
> delta/base file name.  Currently it's 7 digits.  If we use the same sequence, 
> we'll exceed 7 digits faster (possible upgrade issue).  On the other hand, 
> there is value in being able to pick the txn id and commit timestamp out of the 
> same logical sequence.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16495:
--


> ColumnStats merge should consider the accuracy of the current stats
> ---
>
> Key: HIVE-16495
> URL: https://issues.apache.org/jira/browse/HIVE-16495
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16493) Skip column stats when colStats is empty

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16493:
---
Attachment: HIVE-16493.01.patch

> Skip column stats when colStats is empty
> 
>
> Key: HIVE-16493
> URL: https://issues.apache.org/jira/browse/HIVE-16493
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16493.01.patch
>
>
> Otherwise it will throw an NPE.
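
For context, a minimal sketch of the "skip when empty" guard implied above; the 
class and method names are hypothetical, not the actual Hive code:
{code}
import java.util.Arrays;
import java.util.List;

public class ColStatsGuardSketch {
  static void mergeColumnStats(List<String> colStats) {
    if (colStats == null || colStats.isEmpty()) {
      // Nothing to merge; bail out instead of dereferencing null later.
      return;
    }
    for (String stat : colStats) {
      System.out.println("merging column stat: " + stat);
    }
  }

  public static void main(String[] args) {
    mergeColumnStats(null);                      // safely skipped
    mergeColumnStats(Arrays.asList("c1", "c2")); // processed
  }
}
{code}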



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16493) Skip column stats when colStats is empty

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16493:
---
Status: Patch Available  (was: Open)

> Skip column stats when colStats is empty
> 
>
> Key: HIVE-16493
> URL: https://issues.apache.org/jira/browse/HIVE-16493
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16493.01.patch
>
>
> Otherwise it will throw an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16494) udaf percentile_approx() may fail on CBO

2017-04-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978014#comment-15978014
 ] 

Ashutosh Chauhan commented on HIVE-16494:
-

unit test in: udaf_percentile_approx_23.q

> udaf percentile_approx() may fail on CBO
> 
>
> Key: HIVE-16494
> URL: https://issues.apache.org/jira/browse/HIVE-16494
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer, UDF
>Reporter: Ashutosh Chauhan
>
> select percentile_approx(key, array(0.50, 0.70, 0.90, 0.95, 0.99)) from t; 
> fails with the error: "The second argument must be a constant."



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16493) Skip column stats when colStats is empty

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16493:
---
Description: Otherwise it will throw an NPE.

> Skip column stats when colStats is empty
> 
>
> Key: HIVE-16493
> URL: https://issues.apache.org/jira/browse/HIVE-16493
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Otherwise it will throw an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16493) Skip column stats when colStats is empty

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16493:
--


> Skip column stats when colStats is empty
> 
>
> Key: HIVE-16493
> URL: https://issues.apache.org/jira/browse/HIVE-16493
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16492) Create view doesn't work with Calcite Return path

2017-04-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978007#comment-15978007
 ] 

Ashutosh Chauhan commented on HIVE-16492:
-

Unit tests in cbo_rp_unionDistinct_2.q, cbo_rp_views.q, cbo_rp_windowing_2.q

> Create view doesn't work with Calcite Return path
> -
>
> Key: HIVE-16492
> URL: https://issues.apache.org/jira/browse/HIVE-16492
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>
> Support for AST path was added in HIVE-15769 but not for return path.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16491) CBO cant handle join involving complex types in on condition

2017-04-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978004#comment-15978004
 ] 

Ashutosh Chauhan commented on HIVE-16491:
-

Unit test in vector_complex_join.q

> CBO cant handle join involving complex types in on condition
> 
>
> Key: HIVE-16491
> URL: https://issues.apache.org/jira/browse/HIVE-16491
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.3.0
>Reporter: Ashutosh Chauhan
>
> Chokes on a query like:
> {code}
>  select *  from test2b join test2a on test2b.a = test2a.a[1];
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15982) Support the width_bucket function

2017-04-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977996#comment-15977996
 ] 

Ashutosh Chauhan commented on HIVE-15982:
-

+1

> Support the width_bucket function
> -
>
> Key: HIVE-15982
> URL: https://issues.apache.org/jira/browse/HIVE-15982
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, 
> HIVE-15982.3.patch, HIVE-15982.4.patch, HIVE-15982.5.patch
>
>
> Support width_bucket(wbo, wbb1, wbb2, wbc), which returns an integer 
> between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by 
> dividing the range between wbb1 and wbb2 into equally sized regions. If wbo < wbb1, 
> return 0; if wbo > wbb2, return wbc+1. Reference: SQL standard section 4.4.
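
A small standalone sketch of the mapping described above (plain Java, not the 
Hive UDF). It follows the SQL-standard convention that values below wbb1 map to 
0 and values at or above wbb2 map to wbc+1, and assumes wbb1 < wbb2 and wbc > 0:
{code}
public class WidthBucketSketch {
  static long widthBucket(double wbo, double wbb1, double wbb2, long wbc) {
    if (wbo < wbb1) {
      return 0;            // below the range
    }
    if (wbo >= wbb2) {
      return wbc + 1;      // at or above the range
    }
    double bucketWidth = (wbb2 - wbb1) / wbc;
    return (long) ((wbo - wbb1) / bucketWidth) + 1;
  }

  public static void main(String[] args) {
    // Map sample values into 4 equally sized buckets over [0, 20).
    for (double v : new double[] {-5, 0, 4.9, 5, 19.9, 20, 25}) {
      System.out.println("width_bucket(" + v + ", 0, 20, 4) = "
          + widthBucket(v, 0, 20, 4));
    }
  }
}
{code}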



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16233:

Attachment: HIVE-16233.02.patch

An updated patch after looking at 2 potential improvements.

> llap: Query failed with AllocatorOutOfMemoryException
> -
>
> Key: HIVE-16233
> URL: https://issues.apache.org/jira/browse/HIVE-16233
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16233.01.patch, HIVE-16233.02.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_56_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
> at 
> 

[jira] [Updated] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16233:

Attachment: (was: HIVE-16233.patch)

> llap: Query failed with AllocatorOutOfMemoryException
> -
>
> Key: HIVE-16233
> URL: https://issues.apache.org/jira/browse/HIVE-16233
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16233.01.patch, HIVE-16233.02.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_56_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
> at 
> 

[jira] [Updated] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16233:

Attachment: (was: HIVE-16233.WIP.patch)

> llap: Query failed with AllocatorOutOfMemoryException
> -
>
> Key: HIVE-16233
> URL: https://issues.apache.org/jira/browse/HIVE-16233
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16233.01.patch, HIVE-16233.02.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_56_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
> at 
> 

[jira] [Updated] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16233:

Attachment: (was: HIVE-16233.WIP.patch)

> llap: Query failed with AllocatorOutOfMemoryException
> -
>
> Key: HIVE-16233
> URL: https://issues.apache.org/jira/browse/HIVE-16233
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16233.01.patch, HIVE-16233.02.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_56_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
> at 
> 

[jira] [Updated] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16233:

Attachment: (was: HIVE-16233.WIP.patch)

> llap: Query failed with AllocatorOutOfMemoryException
> -
>
> Key: HIVE-16233
> URL: https://issues.apache.org/jira/browse/HIVE-16233
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16233.01.patch, HIVE-16233.02.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_56_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
> ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
> at 
> 

[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977939#comment-15977939
 ] 

Rui Li commented on HIVE-11133:
---

Another example of a non-root stage having vertex dependencies is a conditional task. 
Suppose you disable {{hive.auto.convert.join}} and enable 
{{hive.optimize.skewjoin}}. The following query will generate a conditional task 
to join skewed data:
{code}
select A.key from A join B on A.key=B.key group by A.key;
{code}

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977931#comment-15977931
 ] 

Rui Li commented on HIVE-11133:
---

[~stakiar], the plan in your last comment doesn't seem correct. Looking at the 
"raw query plan", Stage-2 is the root stage and only contains Map 4. However, 
in the explain output, the vertex dependency is for Stage-1, although it says 
it's for the root stage.

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15982) Support the width_bucket function

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15982:

Attachment: HIVE-15982.5.patch

Updated patch, includes tests for negative expr, min, and max values.

> Support the width_bucket function
> -
>
> Key: HIVE-15982
> URL: https://issues.apache.org/jira/browse/HIVE-15982
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, 
> HIVE-15982.3.patch, HIVE-15982.4.patch, HIVE-15982.5.patch
>
>
> Support width_bucket(wbo, wbb1, wbb2, wbc), which returns an integer 
> between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by 
> dividing the range between wbb1 and wbb2 into equally sized regions. If wbo < wbb1, 
> return 0; if wbo > wbb2, return wbc+1. Reference: SQL standard section 4.4.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15982) Support the width_bucket function

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15982:

Attachment: HIVE-15982.4.patch

Attaching an updated patch that has better handling of {{null}}s. It occurred to 
me that there are no tests with negative integer values; I'm working on adding 
those now.

> Support the width_bucket function
> -
>
> Key: HIVE-15982
> URL: https://issues.apache.org/jira/browse/HIVE-15982
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, 
> HIVE-15982.3.patch, HIVE-15982.4.patch
>
>
> Support width_bucket(wbo, wbb1, wbb2, wbc), which returns an integer 
> between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by 
> dividing the range between wbb1 and wbb2 into equally sized regions. If wbo < wbb1, 
> return 0; if wbo > wbb2, return wbc+1. Reference: SQL standard section 4.4.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977912#comment-15977912
 ] 

Sahil Takiar edited comment on HIVE-11133 at 4/21/17 1:29 AM:
--

[~xuefuz], [~lirui]

The qtest in the patch has a very similar query:

{code}
select sum(hash(a.k1,a.v1,a.k2, a.v2))
from (
select src1.key as k1, src1.value as v1, 
   src2.key as k2, src2.value as v2 FROM 
  (select * FROM src WHERE src.key < 10) src1 
JOIN 
  (select * FROM src WHERE src.key < 10) src2
  SORT BY k1, v1, k2, v2
) a
{code}

It's also a mapjoin. The user-level explain output is:

{code}
Plan not optimized by CBO.

Vertex dependency in root stage
Reducer 2 <- Map 1 (PARTITION-LEVEL SORT)
Reducer 3 <- Reducer 2 (GROUP)

Stage-0
  Fetch Operator
limit:-1
Stage-1
  Reducer 3
  File Output Operator [FS_17]
Group By Operator [GBY_15] (rows=1 width=8)
  Output:["_col0"],aggregations:["sum(VALUE._col0)"]
<-Reducer 2 [GROUP]
  GROUP [RS_14]
Group By Operator [GBY_13] (rows=1 width=8)
  
Output:["_col0"],aggregations:["sum(hash(_col0,_col1,_col2,_col3))"]
  Select Operator [SEL_11] (rows=27556 width=22)
Output:["_col0","_col1","_col2","_col3"]
  <-Map 1 [PARTITION-LEVEL SORT]
PARTITION-LEVEL SORT [RS_10]
  Map Join Operator [MAPJOIN_20] (rows=27556 width=22)
Conds:(Inner),Output:["_col0","_col1","_col2","_col3"]
  <-Select Operator [SEL_2] (rows=166 width=10)
  Output:["_col0","_col1"]
  Filter Operator [FIL_18] (rows=166 width=10)
predicate:(key < 10)
TableScan [TS_0] (rows=500 width=10)
  
default@src,src,Tbl:COMPLETE,Col:NONE,Output:["key","value"]
Map Reduce Local Work
Stage-2
  Map 4
  keys: [HASHTABLESINK_22]
Select Operator [SEL_5] (rows=166 width=10)
  Output:["_col0","_col1"]
  Filter Operator [FIL_19] (rows=166 width=10)
predicate:(key < 10)
TableScan [TS_3] (rows=500 width=10)
  default@src,src,Tbl:COMPLETE,Col:NONE,Output:["key","value"]
  Map Reduce Local Work
{code}

The raw query plan looks like:

{code}
{
  "STAGE DEPENDENCIES": {
"Stage-2": {
  "ROOT STAGE": "TRUE"
},
"Stage-1": {
  "DEPENDENT STAGES": "Stage-2"
},
"Stage-0": {
  "DEPENDENT STAGES": "Stage-1"
}
  },
  "STAGE PLANS": {
"Stage-2": {
  "Spark": {
"Vertices:": {
  "Map 2": {
"Map Operator Tree:": [
  {
"TableScan": {
  "Output:": [
"key",
"value"
  ],
  "_empty_": "default@myinput1,b,Tbl:COMPLETE,Col:NONE",
  "Statistics:": "rows=3 width=8",
  "OperatorId:": "TS_1",
  "children": {
"keys:": {
  "0": "key",
  "1": "value",
  "OperatorId:": "HASHTABLESINK_10"
}
  }
}
  }
],
"Local Work:": {
  "Map Reduce Local Work": {

  }
},
"tag:": "0"
  }
}
  }
},
"Stage-1": {
  "Spark": {
"Vertices:": {
  "Map 1": {
"Map Operator Tree:": [
  {
"TableScan": {
  "Output:": [
"key",
"value"
  ],
  "_empty_": "default@myinput1,a,Tbl:COMPLETE,Col:NONE",
  "Statistics:": "rows=3 width=8",
  "OperatorId:": "TS_0",
  "children": {
"Map Join Operator": {
  "condition map:": [
{
  "_empty_": 
"{\"type\":\"Inner\",\"left\":0,\"right\":1}"
}
  ],
  "input vertices:": {
"1": "Map 2"
  },
  "keys:": {
"0": "key",
"1": "value"
  },
  "Output:": [
"_col0",
"_col1",
"_col5",
"_col6"
  ],
  "Statistics:": "rows=3 width=9",
  "OperatorId:": "MAPJOIN_7",
  "children": {
"Select Operator": {
  "Output:": [
"_col0",
 

[jira] [Commented] (HIVE-15795) Support Accumulo Index Tables in Hive Accumulo Connector

2017-04-20 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977911#comment-15977911
 ] 

Josh Elser commented on HIVE-15795:
---

Thanks for the commit, Sergey!

Great job on this, Mike!

> Support Accumulo Index Tables in Hive Accumulo Connector
> 
>
> Key: HIVE-15795
> URL: https://issues.apache.org/jira/browse/HIVE-15795
> Project: Hive
>  Issue Type: Improvement
>  Components: Accumulo Storage Handler
>Reporter: Mike Fagan
>Assignee: Mike Fagan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15795.1.patch, HIVE-15795.2.patch
>
>
> Ability to specify an Accumulo index table for an Accumulo-Hive table.
> This would greatly improve performance for non-rowid query predicates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977901#comment-15977901
 ] 

Eugene Koifman commented on HIVE-16321:
---

committed to master 
https://github.com/apache/hive/commit/182218b760397e27936c5b9885083cdc774fef90


> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch, 
> HIVE-16321.02.patch, HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]
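
A self-contained toy (not Hive's TxnHandler) that reproduces the pattern 
described above: a fixed-size "pool" is exhausted by concurrent lock() calls, 
each of which then needs a second slot for the mutex, so none can proceed. 
Giving the mutex its own pool, as suggested, breaks the cycle. A Semaphore 
stands in for the JDBC connection pool, and a short timeout replaces the real 
deadlock so the demo terminates.
{code}
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class MutexPoolDeadlockDemo {
  static final int POOL_SIZE = 2;
  static final Semaphore connectionPool = new Semaphore(POOL_SIZE);
  // The proposed fix: a separate pool for the mutex connections.
  static final Semaphore mutexPool = new Semaphore(POOL_SIZE);

  static void lock(int id, boolean separateMutexPool) throws InterruptedException {
    connectionPool.acquire();               // connection for the operation itself
    try {
      Semaphore pool = separateMutexPool ? mutexPool : connectionPool;
      // Second acquire models MutexAPI.acquireLock() needing its own connection.
      if (!pool.tryAcquire(1, TimeUnit.SECONDS)) {
        System.out.println("lock(" + id + "): stuck waiting for a mutex connection");
        return;
      }
      try {
        System.out.println("lock(" + id + "): acquired mutex, running checkLock()");
      } finally {
        pool.release();
      }
    } finally {
      connectionPool.release();
    }
  }

  static void runCallers(boolean separateMutexPool) throws InterruptedException {
    Thread[] callers = new Thread[POOL_SIZE];
    for (int i = 0; i < POOL_SIZE; i++) {
      final int id = i;
      callers[i] = new Thread(() -> {
        try {
          lock(id, separateMutexPool);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      callers[i].start();
    }
    for (Thread t : callers) {
      t.join();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("shared pool (deadlocks, times out):");
    runCallers(false);
    System.out.println("separate mutex pool (proceeds):");
    runCallers(true);
  }
}
{code}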



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Target Version/s: 2.3.0, 3.0.0

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch, 
> HIVE-16321.02.patch, HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format

2017-04-20 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977889#comment-15977889
 ] 

Colin Ma commented on HIVE-16465:
-

The failed test is not patch related.

> NullPointer Exception when enable vectorization for Parquet file format
> ---
>
> Key: HIVE-16465
> URL: https://issues.apache.org/jira/browse/HIVE-16465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Ma
>Assignee: Colin Ma
>Priority: Critical
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16465.001.patch
>
>
> NullPointerException when enabling vectorization for the Parquet file format. It 
> is caused by a null InputSplit.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Status: Patch Available  (was: Open)

branch-2 patch

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch, 
> HIVE-16321.02.patch, HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Attachment: (was: HIVE-16321.01.branch-2.patch)

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch, 
> HIVE-16321.02.patch, HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Attachment: HIVE-16321.01.branch-2.patch

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.branch-2.patch, HIVE-16321.01.patch, 
> HIVE-16321.02.patch, HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16389) Allow HookContext to access SQLOperationDisplay

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977864#comment-15977864
 ] 

Sahil Takiar commented on HIVE-16389:
-

[~spena] rebased the patch and added some more unit tests. Attached updated 
patch file and updated the RB.

> Allow HookContext to access SQLOperationDisplay
> ---
>
> Key: HIVE-16389
> URL: https://issues.apache.org/jira/browse/HIVE-16389
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16389.1.patch, HIVE-16389.2.patch, 
> HIVE-16389.3.patch, HIVE-16389.4.patch
>
>
> There is a lot of useful information in {{SQLOperationDisplay}} that users of 
> Hive Hooks may be interested in.
> We should allow Hive Hooks to access this info by adding the 
> {{SQLOperationDisplay}} to {{HookContext}}.
> This will allow hooks to have access to all information available in the HS2 
> Web UI.
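
As a rough illustration of how a hook could consume this once it lands, here is a hedged sketch. The accessor name {{getSqlOperationDisplay()}} is a hypothetical placeholder for whatever getter the patch actually adds to {{HookContext}}; only the {{ExecuteWithHookContext}} interface itself is taken from the existing hooks API.

{code}
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;

public class OperationDisplayLoggingHook implements ExecuteWithHookContext {
  @Override
  public void run(HookContext hookContext) throws Exception {
    // Hypothetical accessor from this JIRA; the concrete type would be SQLOperationDisplay.
    Object display = hookContext.getSqlOperationDisplay();
    if (display != null) {
      // A real hook would pull out the runtime/plan details that the HS2 Web UI shows today.
      System.out.println("SQLOperationDisplay: " + display);
    }
  }
}
{code}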



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16389) Allow HookContext to access SQLOperationDisplay

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16389:

Attachment: HIVE-16389.4.patch

> Allow HookContext to access SQLOperationDisplay
> ---
>
> Key: HIVE-16389
> URL: https://issues.apache.org/jira/browse/HIVE-16389
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16389.1.patch, HIVE-16389.2.patch, 
> HIVE-16389.3.patch, HIVE-16389.4.patch
>
>
> There is a lot of useful information in {{SQLOperationDisplay}} that users of 
> Hive Hooks may be interested in.
> We should allow Hive Hooks to access this info by adding the 
> {{SQLOperationDisplay}} to {{HookContext}}.
> This will allow hooks to have access to all information available in the HS2 
> Web UI.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15673) Allow multiple queries with disjunction

2017-04-20 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977822#comment-15977822
 ] 

Vineet Garg commented on HIVE-15673:


Hi [~leftylev], it would be good to document this limitation. IMHO, "Subqueries in 
the WHERE Clause" is the most appropriate place to document this.

> Allow multiple queries with disjunction
> ---
>
> Key: HIVE-15673
> URL: https://issues.apache.org/jira/browse/HIVE-15673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-15673.1.patch, HIVE-15673.2.patch, 
> HIVE-15673.3.patch, HIVE-15673.4.patch, HIVE-15673.5.patch, HIVE-15673.6.patch
>
>
> Hive currently doesn't allow multiple subqueries with {{OR}}, since Calcite 
> has a bug in determining the logic for OR expressions. See 
> [CALCITE-1546|https://issues.apache.org/jira/browse/CALCITE-1546].
> Once a Calcite release containing the fix for this bug is available, Hive will 
> need to lift the restriction and add test cases.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-04-20 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-16489:
--
Component/s: (was: HiveServer2)
 Hive

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-16489
> URL: https://issues.apache.org/jira/browse/HIVE-16489
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>
> I've just analyzed an HMS heap dump. It turns out that it contains a lot of 
> duplicate strings that waste 26.4% of the heap. Most of them come from 
> HashMaps referenced by 
> org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the 
> relevant section of the jxray (www.jxray.com) report. Looking at 
> Partition.java, I see that in the past somebody has already added code to 
> intern keys and values in the parameters table when it's first set up. 
> However, it looks like when more key-value pairs are added, they are not 
> interned, which probably explains all these duplicate strings.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> Top duplicate strings:
> Ovhd Num char[]s   Num objs   Value
>  46,088K (0.4%) 58715871  
> "HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s
>  ...[length 4000]"
>  46,088K (0.4%) 58715871  
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG
>  ...[length 4000]"
> ...
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10", 623 of 
> "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG ...[length 
> 4000]", 623 of 
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length 
> 4000]", 623 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]", 623 of 
> "AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length 
> 4000]"
> ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
> {code}
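
To make the suggested fix concrete, here is a minimal sketch, not the actual Partition.java code, of interning the map entries both when the whole parameters map is assigned and when a single entry is added later, which is the path the report above suggests is currently missed. The method names only mirror the usual thrift-generated setter/adder shape and are assumptions of the sketch.

{code}
import java.util.HashMap;
import java.util.Map;

public class ParametersInterningSketch {
  private Map<String, String> parameters = new HashMap<>();

  /** Replace the whole map, interning every key and value (what the existing setter already does). */
  public void setParameters(Map<String, String> params) {
    Map<String, String> interned = new HashMap<>(params.size() * 2);
    for (Map.Entry<String, String> e : params.entrySet()) {
      interned.put(e.getKey().intern(), e.getValue() == null ? null : e.getValue().intern());
    }
    this.parameters = interned;
  }

  /** The add path the report says is missing interning: a single entry added after setup. */
  public void putToParameters(String key, String value) {
    parameters.put(key.intern(), value == null ? null : value.intern());
  }
}
{code}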



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-04-20 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-16489:
--
Description: 
I've just analyzed an HMS heap dump. It turns out that it contains a lot of 
duplicate strings that waste 26.4% of the heap. Most of them come from 
HashMaps referenced by 
org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the 
relevant section of the jxray (www.jxray.com) report. Looking at 
Partition.java, I see that in the past somebody has already added code to 
intern keys and values in the parameters table when it's first set up. However, it 
looks like when more key-value pairs are added, they are not interned, which 
probably explains all these duplicate strings.

{code}
6. DUPLICATE STRINGS

Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
Overhead: 3,220,458K (26.4%)

Top duplicate strings:
Ovhd Num char[]s   Num objs   Value

 46,088K (0.4%) 58715871  
"HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s
 ...[length 4000]"
 46,088K (0.4%) 58715871  
"BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG
 ...[length 4000]"
...

===

7. REFERENCE CHAINS FOR DUPLICATE STRINGS

  2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
arrays:
39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 9583 
of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
"BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
3560]"
... and 419200 more strings, of which 36376 are unique
Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
28 of "2", 21 of "0"
 <--  {j.u.HashMap}.values <-- 
org.apache.hadoop.hive.metastore.api.Partition.parameters <--  {j.u.ArrayList} 
<-- 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
 <-- Java Local 
(org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
 [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
  463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing arrays:
7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
"174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
... and 84009 more strings, of which 34065 are unique
Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 of 
"2", 3 of "0"
 <--  {j.u.HashMap}.values <-- 
org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
{j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
  233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 of 
"10", 623 of "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG 
...[length 4000]", 623 of 
"BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length 
4000]", 623 of 
"BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
3560]", 623 of 
"AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length 
4000]"
... and 44568 more strings, of which 27285 are unique
Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of "6", 
29 of "2", 23 of "5", 19 of "9", 2 of "3"
 <--  {j.u.HashMap}.values <-- 
org.apache.hadoop.hive.metastore.api.Partition.parameters <--  {j.u.ArrayList} 
<-- Java Local (j.u.ArrayList) [@4f4cfbd10,@536122408,@726616778]
...
{code}

  was:
I've created a Hive table with 2000 partitions, each backed by two files, with 
one row in each file. When I execute some number of concurrent queries against 
this table, e.g. as follows

{code}
for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
admin -e "select count(i_f_1) from misha_table;" & done
{code}

it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
server with -Xmx200m, and with 50 queries in the one with -Xmx500m.

I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
that was generated in the 50queries/500m heap scenario. It suggests that there 
are several opportunities to reduce memory pressure with not very invasive 
changes to the code. One (duplicate strings) has been addressed in 
https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going to 
address the fact that almost 20% of memory is used by instances of 
java.util.Properties. These objects are highly duplicated, since for each 
partition each concurrently running query creates its own copy of Partition, 
PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
partitions) Properties in memory. By interning/deduplicating these 

[jira] [Assigned] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-04-20 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev reassigned HIVE-16489:
-


> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-16489
> URL: https://issues.apache.org/jira/browse/HIVE-16489
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m, and with 50 queries in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicated, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-20 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977805#comment-15977805
 ] 

Misha Dmitriev commented on HIVE-16079:
---

It looks like the testCliDriver[vector_if_expr] test is broken; it fails in every 
build.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m, and with 50 queries in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicated, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.
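
A minimal sketch of the copy-on-first-write idea described above, not the actual Hive class, with only a couple of methods overridden for brevity: reads are served from a shared, canonicalized Properties instance until the first mutating call, which copies the data into this object's own table and drops the shared reference.

{code}
import java.util.Properties;

public class CopyOnFirstWritePropertiesSketch extends Properties {
  // Shared canonicalized table; set to null once this instance has been mutated.
  private volatile Properties shared;

  public CopyOnFirstWritePropertiesSketch(Properties canonical) {
    this.shared = canonical;
  }

  @Override
  public String getProperty(String key) {
    Properties s = shared;
    return (s != null) ? s.getProperty(key) : super.getProperty(key);
  }

  @Override
  public synchronized Object put(Object key, Object value) {
    copyOnFirstWrite();
    return super.put(key, value);
  }

  @Override
  public synchronized Object remove(Object key) {
    copyOnFirstWrite();
    return super.remove(key);
  }

  /** On the first mutation, copy everything from the shared table into this instance. */
  private synchronized void copyOnFirstWrite() {
    Properties s = shared;
    if (s != null) {
      shared = null; // mark as copied before filling this instance's own table
      for (String name : s.stringPropertyNames()) {
        super.put(name, s.getProperty(name));
      }
    }
  }
}
{code}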



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15786) Provide additional information from the llapstatus command

2017-04-20 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15786:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Provide additional information from the llapstatus command
> --
>
> Key: HIVE-15786
> URL: https://issues.apache.org/jira/browse/HIVE-15786
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-15786.01.patch, HIVE-15786.03.patch, 
> HIVE-15786.04.patch, HIVE-15786.05.patch
>
>
> Slider is making enhancements to provide additional information like 
> completed containers, pending containers etc.
> Integrate with this to provide additional details in llapstatus.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15786) Provide additional information from the llapstatus command

2017-04-20 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977797#comment-15977797
 ] 

Siddharth Seth commented on HIVE-15786:
---

Thanks for the review. Committed.

[~owen.omalley] - I'd like to get this into the 2.2 release as well. It could 
not be committed earlier because a slider release was not available.

> Provide additional information from the llapstatus command
> --
>
> Key: HIVE-15786
> URL: https://issues.apache.org/jira/browse/HIVE-15786
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-15786.01.patch, HIVE-15786.03.patch, 
> HIVE-15786.04.patch, HIVE-15786.05.patch
>
>
> Slider is making enhancements to provide additional information like 
> completed containers, pending containers etc.
> Integrate with this to provide additional details in llapstatus.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15761) ObjectStore.getNextNotification could return an empty NotificationEventResponse causing TProtocolException

2017-04-20 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977783#comment-15977783
 ] 

Aihua Xu commented on HIVE-15761:
-

Looks good. +1

> ObjectStore.getNextNotification could return an empty 
> NotificationEventResponse causing TProtocolException 
> ---
>
> Key: HIVE-15761
> URL: https://issues.apache.org/jira/browse/HIVE-15761
> Project: Hive
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Sergio Peña
> Attachments: HIVE-15761.1.patch
>
>
> If there are no new events greater than the requested event,  
> ObjectStore.getNextNotification will return an empty 
> NotificationEventResponse. And the client side will get the following 
> exception:
> {noformat} [ERROR - 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:295)]
>  Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'events' is 
> unset! Struct:NotificationEventResponse(events:null)
>   at 
> org.apache.hadoop.hive.metastore.api.NotificationEventResponse.validate(NotificationEventResponse.java:310)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.validate(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15761) ObjectStore.getNextNotification could return an empty NotificationEventResponse causing TProtocolException

2017-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15761:
---
Status: Patch Available  (was: Open)

> ObjectStore.getNextNotification could return an empty 
> NotificationEventResponse causing TProtocolException 
> ---
>
> Key: HIVE-15761
> URL: https://issues.apache.org/jira/browse/HIVE-15761
> Project: Hive
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Sergio Peña
> Attachments: HIVE-15761.1.patch
>
>
> If there are no new events greater than the requested event,  
> ObjectStore.getNextNotification will return an empty 
> NotificationEventResponse. And the client side will get the following 
> exception:
> {noformat} [ERROR - 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:295)]
>  Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'events' is 
> unset! Struct:NotificationEventResponse(events:null)
>   at 
> org.apache.hadoop.hive.metastore.api.NotificationEventResponse.validate(NotificationEventResponse.java:310)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.validate(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15761) ObjectStore.getNextNotification could return an empty NotificationEventResponse causing TProtocolException

2017-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15761:
---
Attachment: HIVE-15761.1.patch

I couldn't figure out how to make query.execute() return null from the 
getNextNotification() method. But I walked through the DataNucleus internals 
and found that null is returned when a NoQueryResultsException is caught in the 
execute() method.

To avoid this error in the future, I am returning an empty 
NotificationEventResponse object with 0 events, just as normally happens when 
execute() returns 0 results.
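
The defensive pattern above, as a hedged sketch rather than the exact ObjectStore change, assuming the thrift-generated {{NotificationEventResponse}} exposes the usual {{setEvents}}/{{addToEvents}} methods: the required 'events' field is always set, even when the JDO query yields null.

{code}
import java.util.ArrayList;
import java.util.Collection;

import org.apache.hadoop.hive.metastore.api.NotificationEvent;
import org.apache.hadoop.hive.metastore.api.NotificationEventResponse;

public class EmptyNotificationResponseSketch {
  /** Build a response whose required 'events' field is never left unset. */
  public NotificationEventResponse toResponse(Collection<NotificationEvent> queryResult) {
    NotificationEventResponse result = new NotificationEventResponse();
    result.setEvents(new ArrayList<NotificationEvent>()); // empty but set, so thrift validation passes
    if (queryResult == null) {
      return result; // the NoQueryResultsException case: 0 events, not a null field
    }
    for (NotificationEvent event : queryResult) {
      result.addToEvents(event);
    }
    return result;
  }
}
{code}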

> ObjectStore.getNextNotification could return an empty 
> NotificationEventResponse causing TProtocolException 
> ---
>
> Key: HIVE-15761
> URL: https://issues.apache.org/jira/browse/HIVE-15761
> Project: Hive
>  Issue Type: Bug
>Reporter: Hao Hao
>Assignee: Sergio Peña
> Attachments: HIVE-15761.1.patch
>
>
> If there are no new events greater than the requested event,  
> ObjectStore.getNextNotification will return an empty 
> NotificationEventResponse. And the client side will get the following 
> exception:
> {noformat} [ERROR - 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:295)]
>  Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'events' is 
> unset! Struct:NotificationEventResponse(events:null)
>   at 
> org.apache.hadoop.hive.metastore.api.NotificationEventResponse.validate(NotificationEventResponse.java:310)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.validate(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result$get_next_notification_resultStandardScheme.write(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_next_notification_result.write(ThriftHiveMetastore.java)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Patch Available  (was: Open)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485-disableMasking
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Open  (was: Patch Available)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485.01.patch, HIVE-16485-disableMasking
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15786) Provide additional information from the llapstatus command

2017-04-20 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977694#comment-15977694
 ] 

Prasanth Jayachandran commented on HIVE-15786:
--

+1

> Provide additional information from the llapstatus command
> --
>
> Key: HIVE-15786
> URL: https://issues.apache.org/jira/browse/HIVE-15786
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-15786.01.patch, HIVE-15786.03.patch, 
> HIVE-15786.04.patch, HIVE-15786.05.patch
>
>
> Slider is making enhancements to provide additional information like 
> completed containers, pending containers etc.
> Integrate with this to provide additional details in llapstatus.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: HIVE-16485-disableMasking

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16485-disableMasking
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: (was: HIVE-16485.01.patch)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Status: Patch Available  (was: Open)

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16485:
---
Attachment: HIVE-16485.01.patch

This partially reverts HIVE-16310, HIVE-16018, HIVE-16142, and HIVE-15955.

> Enable outputName for RS operator in explain formatted
> --
>
> Key: HIVE-16485
> URL: https://issues.apache.org/jira/browse/HIVE-16485
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction throws an exception

2017-04-20 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-16213:
---
Attachment: HIVE-16213.02.patch

Thanks for the review, [~spena] and [~stakiar]. While adding a test case as 
suggested by Sahil in the review, I found a few more places in the code where the 
query can leak. Uploading a second version of the patch.

> ObjectStore can leak Queries when rollbackTransaction throws an exception
> -
>
> Key: HIVE-16213
> URL: https://issues.apache.org/jira/browse/HIVE-16213
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Alexander Kolbasov
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16213.01.patch, HIVE-16213.02.patch
>
>
> In ObjectStore.java there are a few places with the code similar to:
> {code}
> Query query = null;
> try {
>   openTransaction();
>   query = pm.newQuery(Something.class);
>   ...
>   commited = commitTransaction();
> } finally {
>   if (!commited) {
> rollbackTransaction();
>   }
>   if (query != null) {
> query.closeAll();
>   }
> }
> {code}
> The problem is that rollbackTransaction() may throw an exception in which 
> case query.closeAll() wouldn't be executed. 
> The fix would be to wrap rollbackTransaction in its own try-catch block.
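
For clarity, a sketch of the fixed pattern, reusing the placeholder names from the snippet above ({{pm}}, {{Something}}, and an assumed {{LOG}} field) rather than reproducing the exact ObjectStore change: the rollback gets its own try/catch so an exception thrown by {{rollbackTransaction()}} can no longer skip {{query.closeAll()}}.

{code}
Query query = null;
boolean commited = false;
try {
  openTransaction();
  query = pm.newQuery(Something.class);
  // ... run the query and build the result ...
  commited = commitTransaction();
} finally {
  if (!commited) {
    try {
      rollbackTransaction();
    } catch (Exception e) {
      LOG.warn("rollbackTransaction failed", e); // swallow so the cleanup below still runs
    }
  }
  if (query != null) {
    query.closeAll();
  }
}
{code}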



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977619#comment-15977619
 ] 

Xuefu Zhang edited comment on HIVE-11133 at 4/20/17 9:53 PM:
-

[~stakiar], In spark, one non-root stage may have a vertex dependency. For 
instance
{code}
select sum(hash(a.k1,a.v1,a.k2, a.v2))
from (
select src1.key as k1, src1.value as v1, 
   src2.key as k2, src2.value as v2 FROM 
  (select * FROM src WHERE src.key < 10) src1 
JOIN 
  (select src.key, sum(src.value) FROM src WHERE src.key < 10 group by src.key) 
src2
  SORT BY k1, v1, k2, v2
) a
{code}
Basically, the stage that generates the hash table used for mapjoin can have 
vertex dependencies (map -> reduce). I'm not sure if this is the only case, 
though.


was (Author: xuefuz):
[~stakiar], In spark, any stage may have a vertex dependency. For instance
{code}
select sum(hash(a.k1,a.v1,a.k2, a.v2))
from (
select src1.key as k1, src1.value as v1, 
   src2.key as k2, src2.value as v2 FROM 
  (select * FROM src WHERE src.key < 10) src1 
JOIN 
  (select src.key, sum(src.value) FROM src WHERE src.key < 10 group by src.key) 
src2
  SORT BY k1, v1, k2, v2
) a
{code}
Basically, the stage that generates the hash table used for mapjoin can have 
vertex dependencies (map -> reduce). I'm not sure if this is the only case, 
though.

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction throws an exception

2017-04-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977621#comment-15977621
 ] 

Sergio Peña commented on HIVE-16213:


The patch looks good [~vihangk1]. 
+1

> ObjectStore can leak Queries when rollbackTransaction throws an exception
> -
>
> Key: HIVE-16213
> URL: https://issues.apache.org/jira/browse/HIVE-16213
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Alexander Kolbasov
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16213.01.patch
>
>
> In ObjectStore.java there are a few places with the code similar to:
> {code}
> Query query = null;
> try {
>   openTransaction();
>   query = pm.newQuery(Something.class);
>   ...
>   commited = commitTransaction();
> } finally {
>   if (!commited) {
> rollbackTransaction();
>   }
>   if (query != null) {
> query.closeAll();
>   }
> }
> {code}
> The problem is that rollbackTransaction() may throw an exception in which 
> case query.closeAll() wouldn't be executed. 
> The fix would be to wrap rollbackTransaction in its own try-catch block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977619#comment-15977619
 ] 

Xuefu Zhang commented on HIVE-11133:


[~stakiar], In spark, any stage may have a vertex dependency. For instance
{code}
select sum(hash(a.k1,a.v1,a.k2, a.v2))
from (
select src1.key as k1, src1.value as v1, 
   src2.key as k2, src2.value as v2 FROM 
  (select * FROM src WHERE src.key < 10) src1 
JOIN 
  (select src.key, sum(src.value) FROM src WHERE src.key < 10 group by src.key) 
src2
  SORT BY k1, v1, k2, v2
) a
{code}
Basically, the stage that generates the hash table used for mapjoin can have 
vertex dependencies (map -> reduce). I'm not sure if this is the only case, 
though.

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16485) Enable outputName for RS operator in explain formatted

2017-04-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975811#comment-15975811
 ] 

Pengcheng Xiong edited comment on HIVE-16485 at 4/20/17 9:48 PM:
-

sample output
for
{code}
explain formatted
SELECT x.key, z.value, y.value
FROM srcTable x JOIN srcTable y ON (x.key = y.key) 
JOIN srcTable z ON (x.value = z.value)
{code}
{code}
{"STAGE DEPENDENCIES":{"Stage-1":{"ROOT STAGE":"TRUE"},"Stage-0":{"DEPENDENT 
STAGES":"Stage-1"}},"STAGE 
PLANS":{"Stage-1":{"Tez":{"DagId:":"pxiong_20170419172827_bf3a57c0-fa55-437f-8194-49a97b95c4aa:33","Edges:":{"Reducer
 2":[{"parent":"Map 1","type":"SIMPLE_EDGE"},{"parent":"Map 
4","type":"SIMPLE_EDGE"}],"Reducer 3":[{"parent":"Map 
5","type":"SIMPLE_EDGE"},{"parent":"Reducer 
2","type":"SIMPLE_EDGE"}]},"DagName:":"","Vertices:":{"Map 1":{"Map Operator 
Tree:":[{"TableScan":{"alias:":"x","Statistics:":"Num rows: 1 Data size: 0 
Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"TS_0","children":{"Filter Operator":{"predicate:":"(key is 
not null and value is not null) (type: boolean)","Statistics:":"Num rows: 1 
Data size: 0 Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"FIL_22","children":{"Select Operator":{"expressions:":"key 
(type: string), value (type: 
string)","outputColumnNames:":["_col0","_col1"],"Statistics:":"Num rows: 1 Data 
size: 0 Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"SEL_2","children":{"Reduce Output Operator":{"key 
expressions:":"_col0 (type: string)","sort order:":"+","Map-reduce partition 
columns:":"_col0 (type: string)","Statistics:":"Num rows: 1 Data size: 0 Basic 
stats: PARTIAL Column stats: NONE","value expressions:":"_col1 (type: 
string)","OperatorId:":"RS_9","outputname:":"Reducer 2"],"Execution 
mode:":"llap","LLAP IO:":"no inputs"},"Map 4":{"Map Operator 
Tree:":[{"TableScan":{"alias:":"y","Statistics:":"Num rows: 1 Data size: 0 
Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"TS_3","children":{"Filter Operator":{"predicate:":"key is 
not null (type: boolean)","Statistics:":"Num rows: 1 Data size: 0 Basic stats: 
PARTIAL Column stats: NONE","OperatorId:":"FIL_23","children":{"Select 
Operator":{"expressions:":"key (type: string), value (type: 
string)","outputColumnNames:":["_col0","_col1"],"Statistics:":"Num rows: 1 Data 
size: 0 Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"SEL_5","children":{"Reduce Output Operator":{"key 
expressions:":"_col0 (type: string)","sort order:":"+","Map-reduce partition 
columns:":"_col0 (type: string)","Statistics:":"Num rows: 1 Data size: 0 Basic 
stats: PARTIAL Column stats: NONE","value expressions:":"_col1 (type: 
string)","OperatorId:":"RS_10","outputname:":"Reducer 2"],"Execution 
mode:":"llap","LLAP IO:":"no inputs"},"Map 5":{"Map Operator 
Tree:":[{"TableScan":{"alias:":"z","Statistics:":"Num rows: 1 Data size: 0 
Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"TS_6","children":{"Filter Operator":{"predicate:":"value 
is not null (type: boolean)","Statistics:":"Num rows: 1 Data size: 0 Basic 
stats: PARTIAL Column stats: NONE","OperatorId:":"FIL_24","children":{"Select 
Operator":{"expressions:":"value (type: 
string)","outputColumnNames:":["_col0"],"Statistics:":"Num rows: 1 Data size: 0 
Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"SEL_8","children":{"Reduce Output Operator":{"key 
expressions:":"_col0 (type: string)","sort order:":"+","Map-reduce partition 
columns:":"_col0 (type: string)","Statistics:":"Num rows: 1 Data size: 0 Basic 
stats: PARTIAL Column stats: NONE","OperatorId:":"RS_13","outputname:":"Reducer 
3"],"Execution mode:":"llap","LLAP IO:":"no inputs"},"Reducer 
2":{"Execution mode:":"llap","Reduce Operator Tree:":{"Merge Join 
Operator":{"condition map:":[{"":"Inner Join 0 to 1"}],"keys:":{"0":"_col0 
(type: string)","1":"_col0 (type: 
string)"},"outputColumnNames:":["_col0","_col1","_col3"],"Statistics:":"Num 
rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"MERGEJOIN_25","children":{"Reduce Output Operator":{"key 
expressions:":"_col1 (type: string)","sort order:":"+","Map-reduce partition 
columns:":"_col1 (type: string)","Statistics:":"Num rows: 1 Data size: 0 Basic 
stats: PARTIAL Column stats: NONE","value expressions:":"_col0 (type: string), 
_col3 (type: string)","OperatorId:":"RS_12","outputname:":"Reducer 
3"},"Reducer 3":{"Execution mode:":"llap","Reduce Operator Tree:":{"Merge 
Join Operator":{"condition map:":[{"":"Inner Join 0 to 1"}],"keys:":{"0":"_col1 
(type: string)","1":"_col0 (type: 
string)"},"outputColumnNames:":["_col0","_col3","_col4"],"Statistics:":"Num 
rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: 
NONE","OperatorId:":"MERGEJOIN_26","children":{"Select 
Operator":{"expressions:":"_col0 (type: string), _col4 (type: string), _col3 
(type: 

[jira] [Commented] (HIVE-16439) Exclude older v2 version of jackson lib from dependent jars in pom.xml

2017-04-20 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977528#comment-15977528
 ] 

Ashutosh Chauhan commented on HIVE-16439:
-

seems like exclusion is sufficient. 

> Exclude older v2 version of jackson lib from dependent jars in pom.xml 
> ---
>
> Key: HIVE-16439
> URL: https://issues.apache.org/jira/browse/HIVE-16439
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16439.1.patch
>
>
> There are multiple versions of jackson libs included in the dependent jars 
> like spark-client and metrics-json. That causes older versions of jackson 
> libs to be used.   
> We need to exclude them from the dependencies and use the explicit one 
> (currently 2.6.5) for {{com.fasterxml.jackson.core:jackson-databind}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16307) add IO memory usage report to LLAP UI

2017-04-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977486#comment-15977486
 ] 

Sergey Shelukhin commented on HIVE-16307:
-

Updated. Thanks for pointing this out!

> add IO memory usage report to LLAP UI
> -
>
> Key: HIVE-16307
> URL: https://issues.apache.org/jira/browse/HIVE-16307
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-16307.01.patch, HIVE-16307.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977475#comment-15977475
 ] 

Wei Zheng commented on HIVE-16321:
--

+1

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, 
> HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can 
> coordinate their operations.  It uses a JDBC connection to achieve this.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to TxnHandler.lock(), 
> where X is >= the size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16450) Some metastore operations are not retried even with desired underlining exceptions

2017-04-20 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16450:

Attachment: HIVE-16450.2.patch

> Some metastore operations are not retried even with desired underlining 
> exceptions
> --
>
> Key: HIVE-16450
> URL: https://issues.apache.org/jira/browse/HIVE-16450
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16450.1.patch, HIVE-16450.2.patch
>
>
> In the RetryingHMSHandler class, we expect the operations to be retried 
> when the cause of the MetaException is a JDOException or NucleusException.
> {noformat}
> if (e.getCause() instanceof MetaException && e.getCause().getCause() 
> != null) {
>   if (e.getCause().getCause() instanceof javax.jdo.JDOException ||
>   e.getCause().getCause() instanceof NucleusException) {
> // The JDOException or the Nucleus Exception may be wrapped 
> further in a MetaException
> caughtException = e.getCause().getCause();
>}
> {noformat}
> However, in many places in ObjectStore we only throw new MetaException(msg) 
> without the cause, so we miss retrying in some cases. E.g., with the 
> following JDOException, we should retry but it is ignored.
> {noformat}
> 2017-04-04 17:28:21,602 ERROR metastore.ObjectStore 
> (ObjectStore.java:getMTableColumnStatistics(6555)) - Error retrieving 
> statistics via jdo
> javax.jdo.JDOException: Exception thrown when executing query
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6546)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6606)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6595)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2633)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6594)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6588)
> at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy0.getTableColumnStatistics(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTableUpdateTableColumnStats(HiveAlterHandler.java:787)
> at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:247)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3809)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3779)
> at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
> at com.sun.proxy.$Proxy3.alter_table_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9617)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9601)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>
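
The fix the description implies is to preserve the underlying exception as the cause when building the MetaException, so the retry check can see it. A minimal sketch, assuming a hypothetical helper (the class name below is made up):

{code}
import javax.jdo.JDOException;
import org.apache.hadoop.hive.metastore.api.MetaException;

// Sketch only (made-up helper): keep the JDOException as the cause so that
// RetryingHMSHandler's getCause().getCause() check can recognize it and retry.
public final class MetaExceptionWrapSketch {
  private MetaExceptionWrapSketch() {}

  public static MetaException wrap(String msg, JDOException cause) {
    MetaException me = new MetaException(msg);
    me.initCause(cause); // without this, the retry logic never sees the JDOException
    return me;
  }
}
{code}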

[jira] [Updated] (HIVE-16450) Some metastore operations are not retried even with desired underlying exceptions

2017-04-20 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16450:

Attachment: (was: HIVE-16450.2.patch)

> Some metastore operations are not retried even with desired underlying 
> exceptions
> --
>
> Key: HIVE-16450
> URL: https://issues.apache.org/jira/browse/HIVE-16450
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16450.1.patch
>
>
> In the RetryingHMSHandler class, we expect the operation to be retried 
> when the cause of the MetaException is a JDOException or NucleusException.
> {noformat}
> if (e.getCause() instanceof MetaException && e.getCause().getCause() 
> != null) {
>   if (e.getCause().getCause() instanceof javax.jdo.JDOException ||
>   e.getCause().getCause() instanceof NucleusException) {
> // The JDOException or the Nucleus Exception may be wrapped 
> further in a MetaException
> caughtException = e.getCause().getCause();
>}
> {noformat}
> However, in many places in ObjectStore we only throw new MetaException(msg) 
> without setting the cause, so retries are missed in some cases. E.g., with the 
> following JDOException, we should retry but it's ignored.
> {noformat}
> 2017-04-04 17:28:21,602 ERROR metastore.ObjectStore 
> (ObjectStore.java:getMTableColumnStatistics(6555)) - Error retrieving 
> statistics via jdo
> javax.jdo.JDOException: Exception thrown when executing query
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6546)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6606)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6595)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2633)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6594)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6588)
> at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy0.getTableColumnStatistics(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTableUpdateTableColumnStats(HiveAlterHandler.java:787)
> at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:247)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3809)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3779)
> at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
> at com.sun.proxy.$Proxy3.alter_table_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9617)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9601)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 

[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977392#comment-15977392
 ] 

Pengcheng Xiong commented on HIVE-11133:


[~stakiar], thanks for reaching out to me. The patch looks good to me as well. 
Glad that most of the code works for Spark as well after your refactoring. 
Thanks. :)

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16484:

Attachment: HIVE-16484.1.patch

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
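
A rough sketch of what a SparkLauncher-based launch could look like (the master, app resource and main-class values below are placeholders, and this is not the actual SparkClientImpl wiring):

{code}
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Sketch only: launch a Spark application in-process and watch its state via
// SparkAppHandle instead of shelling out to bin/spark-submit.
public class SparkLauncherSketch {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setMaster("yarn")                          // placeholder master
        .setAppResource("/path/to/app.jar")         // placeholder app resource
        .setMainClass("com.example.RemoteDriver")   // placeholder main class
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            System.out.println("Spark app state: " + h.getState());
          }

          @Override
          public void infoChanged(SparkAppHandle h) {
            // application id and other info updates arrive here
          }
        });

    // The handle exposes the application's lifecycle without parsing process output.
    while (!handle.getState().isFinal()) {
      Thread.sleep(1000);
    }
  }
}
{code}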



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16484:

Status: Patch Available  (was: Open)

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16389) Allow HookContext to access SQLOperationDisplay

2017-04-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977351#comment-15977351
 ] 

Sergio Peña commented on HIVE-16389:


[~stakiar] Can you re-submit the patch so it is tested with the changes on 
HIVE-16363?

> Allow HookContext to access SQLOperationDisplay
> ---
>
> Key: HIVE-16389
> URL: https://issues.apache.org/jira/browse/HIVE-16389
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16389.1.patch, HIVE-16389.2.patch, 
> HIVE-16389.3.patch
>
>
> There is a lot of useful information in {{SQLOperationDisplay}} that users of 
> Hive Hooks may be interested in.
> We should allow Hive Hooks to access this info by adding the 
> {{SQLOperationDisplay}} to {{HookContext}}.
> This will allow hooks to have access to all information available in the HS2 
> Web UI.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16287) Alter table partition rename with location - moves partition back to hive warehouse

2017-04-20 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977306#comment-15977306
 ] 

Vihang Karajgaonkar commented on HIVE-16287:


I think branch-1 is broken. Based on the Git log, the last pre-commit run on 
branch-1 was for HIVE-15833, and I see similar HiveQA results there too.

> Alter table partition rename with location - moves partition back to hive 
> warehouse
> ---
>
> Key: HIVE-16287
> URL: https://issues.apache.org/jira/browse/HIVE-16287
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
> Environment: RHEL 6.8 
>Reporter: Ying Chen
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16287.01.patch, HIVE-16287.02.patch, 
> HIVE-16287.03.patch, HIVE-16287.04.patch, HIVE-16287.05-branch-1.patch, 
> HIVE-16287-addedum.06.patch, HIVE-16287.branch-1.01.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I was renaming my partition in a table that I've created using the location 
> clause, and noticed that when after rename is completed, my partition is 
> moved to the hive warehouse (hive.metastore.warehouse.dir).
> {quote}
> create table test_local_part (col1 int) partitioned by (col2 int) location 
> '/tmp/testtable/test_local_part';
> insert into test_local_part  partition (col2=1) values (1),(3);
> insert into test_local_part  partition (col2=2) values (3);
> alter table test_local_part partition (col2='1') rename to partition 
> (col2='4');
> {quote}
> Running: 
>describe formatted test_local_part partition (col2='2')
> # Detailed Partition Information   
> Partition Value:  [2]  
> Database: default  
> Table:test_local_part  
> CreateTime:   Mon Mar 20 13:25:28 PDT 2017 
> LastAccessTime:   UNKNOWN  
> Protect Mode: None 
> Location: 
> *hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=2*
> Running: 
>describe formatted test_local_part partition (col2='4')
> # Detailed Partition Information   
> Partition Value:  [4]  
> Database: default  
> Table:test_local_part  
> CreateTime:   Mon Mar 20 13:24:53 PDT 2017 
> LastAccessTime:   UNKNOWN  
> Protect Mode: None 
> Location: 
> *hdfs://my.server.com:8020/apps/hive/warehouse/test_local_part/col2=4*
> ---
> Per Sergio's comment - "The rename should create the new partition name in 
> the same location of the table. "



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16363) QueryLifeTimeHooks should catch parse exceptions

2017-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-16363:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~stakiar] for your contribution. I committed this to master.

> QueryLifeTimeHooks should catch parse exceptions
> 
>
> Key: HIVE-16363
> URL: https://issues.apache.org/jira/browse/HIVE-16363
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-16363.1.patch, HIVE-16363.2.patch, 
> HIVE-16363.3.patch, HIVE-16363.4.patch, HIVE-16363.5.patch, HIVE-16363.6.patch
>
>
> The {{QueryLifeTimeHook}} objects do not catch exceptions during query 
> parsing, only query compilation. New methods should be added to hook into pre 
> and post parsing of the query.
> This should be done in a backwards incompatible way so that current 
> implementations of this hook do not break.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16296) use LLAP executor count to configure reducer auto-parallelism

2017-04-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977276#comment-15977276
 ] 

Sergey Shelukhin commented on HIVE-16296:
-

Not sure why data size estimate changed. It changed by a few bytes; could be 
how src or the test tables are written; looks like just the usual statistics 
randomness.

> use LLAP executor count to configure reducer auto-parallelism
> -
>
> Key: HIVE-16296
> URL: https://issues.apache.org/jira/browse/HIVE-16296
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16296.01.patch, HIVE-16296.03.patch, 
> HIVE-16296.04.patch, HIVE-16296.05.patch, HIVE-16296.06.patch, 
> HIVE-16296.07.patch, HIVE-16296.08.patch, HIVE-16296.09.patch, 
> HIVE-16296.10.patch, HIVE-16296.10.patch, HIVE-16296.11.patch, 
> HIVE-16296.12.patch, HIVE-16296.2.patch, HIVE-16296.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977268#comment-15977268
 ] 

ASF GitHub Bot commented on HIVE-16488:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/170

HIVE-16488: Support replicating into existing db if the db is empty

Support REPL LOAD on empty Db (DB with no tables/functions)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-16488

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/170.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #170


commit 381d26026cc3707513f81686d8343e78225cf3c7
Author: Sankar Hariappan 
Date:   2017-04-20T18:47:23Z

HIVE-16488: Support replicating into existing db if the db is empty




> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15795) Support Accumulo Index Tables in Hive Accumulo Connector

2017-04-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15795:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch!

> Support Accumulo Index Tables in Hive Accumulo Connector
> 
>
> Key: HIVE-15795
> URL: https://issues.apache.org/jira/browse/HIVE-15795
> Project: Hive
>  Issue Type: Improvement
>  Components: Accumulo Storage Handler
>Reporter: Mike Fagan
>Assignee: Mike Fagan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15795.1.patch, HIVE-15795.2.patch
>
>
> Ability to specify an accumulo index table for an accumulo-hive table.
> This would greatly improve performance for non-rowid query predicates



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-20 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16488:

Status: Patch Available  (was: In Progress)

> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-20 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977267#comment-15977267
 ] 

Sankar Hariappan edited comment on HIVE-16488 at 4/20/17 6:55 PM:
--

Added 01.patch:
- The DB is considered empty if it has no tables/views and no functions.
- If the DB exists and is empty, then REPL LOAD will just alter the lastReplID part 
of the DB parameters, using a DDLTask for alter DB.
- There is no way to set the description for the DB in the empty-DB case.
- If the DB exists and is not empty, then REPL LOAD continues to throw an exception.

Request [~sushanth], [~thejas] to review the patch.


was (Author: sankarh):
Added 01.patch:
- DB is considered empty if have no tables/views and functions.
- If DB exist and is empty, then REPL LOAD will just alter the lastReplID part 
of DB parameters using DDLTask for alter DB.
- There is no way to set the description for the DB in case of empty DB.
- If DB exists and is not empty, then REPL LOAD continues to throw exception.


> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
> Attachments: HIVE-16488.01.patch
>
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-10865) Beeline needs to support DELIMITER command

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977255#comment-15977255
 ] 

Sahil Takiar commented on HIVE-10865:
-

[~spena], [~aihuaxu] could you take a look? RB: 
https://reviews.apache.org/r/58318/

> Beeline needs to support DELIMITER command
> --
>
> Key: HIVE-10865
> URL: https://issues.apache.org/jira/browse/HIVE-10865
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Sahil Takiar
> Attachments: HIVE-10865.1.patch, HIVE-10865.2.patch, 
> HIVE-10865.3.patch, HIVE-10865.4.patch, HIVE-10865.5.patch
>
>
> The MySQL client provides a DELIMITER command to set the statement delimiter.
> Beeline needs to support a similar command to allow commands that use the 
> semi-colon as something other than a statement delimiter (as with MySQL stored 
> procedures). This is a follow-up JIRA for HIVE-10659



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15982) Support the width_bucket function

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977252#comment-15977252
 ] 

Sahil Takiar commented on HIVE-15982:
-

Sounds good, attaching updated patch. It returns 0 now.

> Support the width_bucket function
> -
>
> Key: HIVE-15982
> URL: https://issues.apache.org/jira/browse/HIVE-15982
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, 
> HIVE-15982.3.patch
>
>
> Support width_bucket(wbo, wbb1, wbb2, wbc), which returns an integer between 
> 0 and wbc+1 by mapping wbo into the ith of wbc equally sized buckets formed by 
> dividing the range from wbb1 to wbb2 into equally sized regions. If wbo < wbb1, 
> return 0; if wbo > wbb2, return wbc+1. Reference: SQL standard section 4.4.
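
A plain-Java sketch of the mapping described above, assuming ascending bounds (wbb1 < wbb2) and wbc > 0; this only illustrates the semantics and is not the proposed GenericUDF:

{code}
// Sketch only: values below wbb1 fall into bucket 0, values at or above wbb2
// fall into bucket wbc + 1, everything else lands in buckets 1..wbc.
public final class WidthBucketSketch {
  private WidthBucketSketch() {}

  public static long widthBucket(double wbo, double wbb1, double wbb2, long wbc) {
    if (wbo < wbb1) {
      return 0;
    }
    if (wbo >= wbb2) {
      return wbc + 1;
    }
    double bucketWidth = (wbb2 - wbb1) / wbc;
    return (long) ((wbo - wbb1) / bucketWidth) + 1; // buckets are numbered 1..wbc
  }

  public static void main(String[] args) {
    // e.g. widthBucket(5.35, 0.024, 10.06, 5) == 3
    System.out.println(widthBucket(5.35, 0.024, 10.06, 5));
  }
}
{code}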



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15982) Support the width_bucket function

2017-04-20 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15982:

Attachment: HIVE-15982.3.patch

> Support the width_bucket function
> -
>
> Key: HIVE-15982
> URL: https://issues.apache.org/jira/browse/HIVE-15982
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Assignee: Sahil Takiar
> Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, 
> HIVE-15982.3.patch
>
>
> Support width_bucket(wbo, wbb1, wbb2, wbc), which returns an integer between 
> 0 and wbc+1 by mapping wbo into the ith of wbc equally sized buckets formed by 
> dividing the range from wbb1 to wbb2 into equally sized regions. If wbo < wbb1, 
> return 0; if wbo > wbb2, return wbc+1. Reference: SQL standard section 4.4.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16385) StatsNoJobTask could exit early before all partitions have been processed

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16385:
---
Fix Version/s: 2.3.0

> StatsNoJobTask could exit early before all partitions have been processed
> -
>
> Key: HIVE-16385
> URL: https://issues.apache.org/jira/browse/HIVE-16385
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16385.1.patch
>
>
> For a partitioned table, the class {{StatsNoJobTask}} is supposed to launch 
> threads for all partitions and compute their stats. However, it could exit 
> early after at most 100 seconds:
> {code}
>   private void shutdownAndAwaitTermination(ExecutorService threadPool) {
> // Disable new tasks from being submitted
> threadPool.shutdown();
> try {
>   // Wait a while for existing tasks to terminate
>   if (!threadPool.awaitTermination(100, TimeUnit.SECONDS)) {
> // Cancel currently executing tasks
> threadPool.shutdownNow();
> // Wait a while for tasks to respond to being cancelled
> if (!threadPool.awaitTermination(100, TimeUnit.SECONDS)) {
>   LOG.debug("Stats collection thread pool did not terminate");
> }
>   }
> } catch (InterruptedException ie) {
>   // Cancel again if current thread also interrupted
>   threadPool.shutdownNow();
>   // Preserve interrupt status
>   Thread.currentThread().interrupt();
> }
>   }
> {code}
> The {{shutdown}} call does not wait for all submitted tasks to complete, and 
> the {{awaitTermination}} call waits at most 100 seconds. 
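
One generic way to avoid this early exit (a sketch only, not the actual patch) is to keep the Futures returned by submit() and block on each of them, so the pool is shut down only after every partition has been processed:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: wait for every submitted task to finish before shutting the pool
// down, instead of bounding the wait at a fixed 100 seconds.
public class WaitForAllTasksSketch {
  public static void main(String[] args) throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Future<?>> futures = new ArrayList<>();

    for (int i = 0; i < 10; i++) {          // stands in for "one task per partition"
      final int partition = i;
      futures.add(pool.submit(() -> computeStatsFor(partition)));
    }

    for (Future<?> f : futures) {
      f.get();                              // blocks until that partition's stats are done
    }
    pool.shutdown();                        // safe: nothing is still running
  }

  private static void computeStatsFor(int partition) {
    // placeholder for the per-partition stats computation
  }
}
{code}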



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16385) StatsNoJobTask could exit early before all partitions have been processed

2017-04-20 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16385:
---
Affects Version/s: 2.0.0
   2.1.0

> StatsNoJobTask could exit early before all partitions have been processed
> -
>
> Key: HIVE-16385
> URL: https://issues.apache.org/jira/browse/HIVE-16385
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-16385.1.patch
>
>
> For a partitioned table, the class {{StatsNoJobTask}} is supposed to launch 
> threads for all partitions and compute their stats. However, it could exit 
> early after at most 100 seconds:
> {code}
>   private void shutdownAndAwaitTermination(ExecutorService threadPool) {
> // Disable new tasks from being submitted
> threadPool.shutdown();
> try {
>   // Wait a while for existing tasks to terminate
>   if (!threadPool.awaitTermination(100, TimeUnit.SECONDS)) {
> // Cancel currently executing tasks
> threadPool.shutdownNow();
> // Wait a while for tasks to respond to being cancelled
> if (!threadPool.awaitTermination(100, TimeUnit.SECONDS)) {
>   LOG.debug("Stats collection thread pool did not terminate");
> }
>   }
> } catch (InterruptedException ie) {
>   // Cancel again if current thread also interrupted
>   threadPool.shutdownNow();
>   // Preserve interrupt status
>   Thread.currentThread().interrupt();
> }
>   }
> {code}
> The {{shutdown}} call does not wait for all submitted tasks to complete, and 
> the {{awaitTermination}} call waits at most 100 seconds. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977241#comment-15977241
 ] 

Sahil Takiar commented on HIVE-11133:
-

CC: [~pxiong]. This refactors a lot of the code for User-Level Explain, so tagging 
you in case you want to take a look, as I believe you have been working on this.

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (HIVE-16488) Support replicating into existing db if the db is empty

2017-04-20 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16488 started by Sankar Hariappan.
---
> Support replicating into existing db if the db is empty
> ---
>
> Key: HIVE-16488
> URL: https://issues.apache.org/jira/browse/HIVE-16488
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 2.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, Replication
>
> This is a potential use case where a user may want to manually create a db on 
> destination to make sure it goes to a certain dir root, or they may have 
> cases where the db (default, for instance) was automatically created. We 
> should still allow replicating into this without failing if the db is empty.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16454) Add blobstore tests for inserting empty into dynamic partition/list bucket tables & inserting cross blobstore tables

2017-04-20 Thread Rentao Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977235#comment-15977235
 ] 

Rentao Wu commented on HIVE-16454:
--

Thanks you Sergio!

> Add blobstore tests for inserting empty into dynamic partition/list bucket 
> tables & inserting cross blobstore tables
> 
>
> Key: HIVE-16454
> URL: https://issues.apache.org/jira/browse/HIVE-16454
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16454.patch
>
>
> This patch introduces two regression tests into the hive-blobstore qtest 
> module: insert_empty_into_blobstore.q and insert_blobstore_to_blobstore.q. 
> These tests cover the following cases:
> 1.   Insert empty data into dynamic partitioned and list bucketed tables.
> 2.   Insert data from a blobstore table to another blobstore table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16454) Add blobstore tests for inserting empty into dynamic partition/list bucket tables & inserting cross blobstore tables

2017-04-20 Thread Rentao Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977235#comment-15977235
 ] 

Rentao Wu edited comment on HIVE-16454 at 4/20/17 6:35 PM:
---

Thank you Sergio!


was (Author: rentao):
Thanks you Sergio!

> Add blobstore tests for inserting empty into dynamic partition/list bucket 
> tables & inserting cross blobstore tables
> 
>
> Key: HIVE-16454
> URL: https://issues.apache.org/jira/browse/HIVE-16454
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16454.patch
>
>
> This patch introduces two regression tests into the hive-blobstore qtest 
> module: insert_empty_into_blobstore.q and insert_blobstore_to_blobstore.q. 
> These tests cover the following cases:
> 1.   Insert empty data into dynamic partitioned and list bucketed tables.
> 2.   Insert data from a blobstore table to another blobstore table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16363) QueryLifeTimeHooks should catch parse exceptions

2017-04-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977231#comment-15977231
 ] 

Sergio Peña commented on HIVE-16363:


Great, thanks [~stakiar]
+1

> QueryLifeTimeHooks should catch parse exceptions
> 
>
> Key: HIVE-16363
> URL: https://issues.apache.org/jira/browse/HIVE-16363
> Project: Hive
>  Issue Type: Bug
>  Components: Hooks
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16363.1.patch, HIVE-16363.2.patch, 
> HIVE-16363.3.patch, HIVE-16363.4.patch, HIVE-16363.5.patch, HIVE-16363.6.patch
>
>
> The {{QueryLifeTimeHook}} objects do not catch exceptions during query 
> parsing, only query compilation. New methods should be added to hook into pre 
> and post parsing of the query.
> This should be done in a backwards incompatible way so that current 
> implementations of this hook do not break.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark

2017-04-20 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977217#comment-15977217
 ] 

Sahil Takiar commented on HIVE-11133:
-

[~lirui] thanks for taking a look. Do you have a HoS query in mind that has 
dependencies between non-root stages? I can test it and find out.

> Support hive.explain.user for Spark
> ---
>
> Key: HIVE-11133
> URL: https://issues.apache.org/jira/browse/HIVE-11133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Mohit Sabharwal
>Assignee: Sahil Takiar
> Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, 
> HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, 
> HIVE-11133.6.patch, HIVE-11133.7.patch
>
>
> User friendly explain output ({{set hive.explain.user=true}}) should support 
> Spark as well. 
> Once supported, we should also enable related q-tests like {{explainuser_1.q}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977208#comment-15977208
 ] 

Eugene Koifman commented on HIVE-16321:
---

Patches 2 and 3 are the same but have different sets of failures, except for 
vector_if_expr, which is a flaky test.

No related failures.

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, 
> HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism how different Metastore instances can 
> coordinate their operations.  It uses a JDBCConnection to achieve it.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to  TxnHandler.lock(), 
> where X is >= size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16454) Add blobstore tests for inserting empty into dynamic partition/list bucket tables & inserting cross blobstore tables

2017-04-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977211#comment-15977211
 ] 

Sergio Peña commented on HIVE-16454:


Thanks [~rentao] for your contribution. I committed this patch to master, 
branch-2 and branch-2.3.

I just noticed that the branch-2.2 does not have the hive-blobstore integration 
tests. [~owen.omalley] I was wondering if you added blobstore improvements to 
the branch-2.2. Or is it meant to be on 2.3 only?

> Add blobstore tests for inserting empty into dynamic partition/list bucket 
> tables & inserting cross blobstore tables
> 
>
> Key: HIVE-16454
> URL: https://issues.apache.org/jira/browse/HIVE-16454
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16454.patch
>
>
> This patch introduces two regression tests into the hive-blobstore qtest 
> module: insert_empty_into_blobstore.q and insert_blobstore_to_blobstore.q. 
> These tests cover the following cases:
> 1.   Insert empty data into dynamic partitioned and list bucketed tables.
> 2.   Insert data from a blobstore table to another blobstore table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16454) Add blobstore tests for inserting empty into dynamic partition/list bucket tables & inserting cross blobstore tables

2017-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-16454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-16454:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add blobstore tests for inserting empty into dynamic partition/list bucket 
> tables & inserting cross blobstore tables
> 
>
> Key: HIVE-16454
> URL: https://issues.apache.org/jira/browse/HIVE-16454
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16454.patch
>
>
> This patch introduces two regression tests into the hive-blobstore qtest 
> module: insert_empty_into_blobstore.q and insert_blobstore_to_blobstore.q. 
> These tests cover the following cases:
> 1.   Insert empty data into dynamic partitioned and list bucketed tables.
> 2.   Insert data from a blobstore table to another blobstore table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Attachment: (was: HIVE-16321.05.patch)

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, 
> HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism how different Metastore instances can 
> coordinate their operations.  It uses a JDBCConnection to achieve it.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to  TxnHandler.lock(), 
> where X is >= size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Status: Open  (was: Patch Available)

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, 
> HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism how different Metastore instances can 
> coordinate their operations.  It uses a JDBCConnection to achieve it.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to  TxnHandler.lock(), 
> where X is >= size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16454) Add blobstore tests for inserting empty into dynamic partition/list bucket tables & inserting cross blobstore tables

2017-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-16454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-16454:
---
Fix Version/s: (was: 2.2.0)

> Add blobstore tests for inserting empty into dynamic partition/list bucket 
> tables & inserting cross blobstore tables
> 
>
> Key: HIVE-16454
> URL: https://issues.apache.org/jira/browse/HIVE-16454
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.1.1
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Fix For: 2.3.0, 3.0.0, 2.4.0
>
> Attachments: HIVE-16454.patch
>
>
> This patch introduces two regression tests into the hive-blobstore qtest 
> module: insert_empty_into_blobstore.q and insert_blobstore_to_blobstore.q. 
> These tests cover the following cases:
> 1.   Insert empty data into dynamic partitioned and list bucketed tables.
> 2.   Insert data from a blobstore table to another blobstore table.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16321) Possible deadlock in metastore with Acid enabled

2017-04-20 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16321:
--
Attachment: (was: HIVE-16321.04.patch)

> Possible deadlock in metastore with Acid enabled
> 
>
> Key: HIVE-16321
> URL: https://issues.apache.org/jira/browse/HIVE-16321
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch, 
> HIVE-16321.03.patch
>
>
> TxnStore.MutexAPI is a mechanism how different Metastore instances can 
> coordinate their operations.  It uses a JDBCConnection to achieve it.
> In some cases this may lead to deadlock.  TxnHandler uses a connection pool 
> of fixed size.  Suppose you have X simultaneous calls to  TxnHandler.lock(), 
> where X is >= size of the pool.  This takes all connections from the pool, so 
> when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat} 
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the 
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.  
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn 
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after 
> enqueueing the lock with the expectation that the caller will always follow 
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer

2017-04-20 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977175#comment-15977175
 ] 

Vihang Karajgaonkar edited comment on HIVE-16451 at 4/20/17 6:13 PM:
-

Thanks for finding this out [~pvary]. Although I didn't quite get how the patch 
fixes the race condition. The way I understand the issue is that there is a 
Logging thread and the thread executing the HiveStatement. Both these threads 
are accessing isLogBeingGenerated, isCancelled, isQueryClosed flags in the same 
HiveStatement object. None of these getters and setters are thread safe. I 
think there could be more undiscovered race-conditions in this execution path.


was (Author: vihangk1):
Thanks for finding this out [~pvary]. Although I didn't quite get how the patch 
fixes the race condition. The way I understand the issue is you have Log thread 
and the thread executing the HiveStatement. Both these threads are accessible 
isLogBeingGenerated, isCancelled, isQueryClosed flags in HiveStatement object. 
None of which are thread safe. I think there could be more undiscovered 
race-conditions in this execution path.

> Race condition between HiveStatement.getQueryLog and 
> HiveStatement.runAsyncOnServer
> ---
>
> Key: HIVE-16451
> URL: https://issues.apache.org/jira/browse/HIVE-16451
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, JDBC
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16451.02.patch, HIVE-16451.03.patch, 
> HIVE-16451.patch
>
>
> During BeeLineDriver testing I have hit the following race condition:
> - Running the query asynchronously through BeeLine
> - Querying the logs in BeeLine
> In the following code:
> {code:title=HiveStatement.runAsyncOnServer}
>   private void runAsyncOnServer(String sql) throws SQLException {
> checkConnection("execute");
> closeClientOperation();
> initFlags();
> [..]
>   }
> {code}
> {code:title=HiveStatement.getQueryLog}
>   public List getQueryLog(boolean incremental, int fetchSize)
>   throws SQLException, ClosedOrCancelledStatementException {
> [..]
> try {
>   if (stmtHandle != null) {
> [..]
>   } else {
> if (isQueryClosed) {
>   throw new ClosedOrCancelledStatementException("Method getQueryLog() 
> failed. The " +
>   "statement has been closed or cancelled.");
> } else {
>   return logs;
> }
>   }
> } catch (SQLException e) {
> [..]
> }
> [..]
>   }
> {code}
> In runAsyncOnServer, {{closeClientOperation}} sets the {{isQueryClosed}} flag 
> to true:
> {code:title=HiveStatement.closeClientOperation}
>   void closeClientOperation() throws SQLException {
> [..]
> isQueryClosed = true;
> isExecuteStatementFailed = false;
> stmtHandle = null;
>   }
> {code}
> The {{initFlags}} sets it to false:
> {code}
>   private void initFlags() {
> isCancelled = false;
> isQueryClosed = false;
> isLogBeingGenerated = true;
> isExecuteStatementFailed = false;
> isOperationComplete = false;
>   }
> {code}
> If {{getQueryLog}} is called after {{closeClientOperation}} but before 
> {{initFlags}}, then we get the following warning if verbose mode is set to 
> true in BeeLine:
> {code}
> Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> getQueryLog() failed. The statement has been closed or cancelled. 
> (state=,code=0)
> {code}
> This caused the following failure:
> https://builds.apache.org/job/PreCommit-HIVE-Build/4691/testReport/org.apache.hadoop.hive.cli/TestBeeLineDriver/testCliDriver_smb_mapjoin_11_/
> {code}
> Error Message
> Client result comparison failed with error code = 1 while executing 
> fname=smb_mapjoin_11
> 16a17
> > Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> > getQueryLog() failed. The statement has been closed or cancelled. 
> > (state=,code=0)
> {code}
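
A generic illustration of one way to remove this kind of race (not the actual HIVE-16451 patch): guard the close-and-reinitialize sequence and the flag reads with a single lock, so the log-polling thread can never observe the intermediate closed-but-not-yet-reinitialized state. The class and method names below are made up for the sketch:

{code}
// Sketch only: one lock covers both the state transition and the reads, so a
// concurrent reader sees either the state before or after, never in between.
public class StatementStateSketch {
  private final Object stateLock = new Object();
  private boolean isQueryClosed;
  private boolean isLogBeingGenerated;

  // Stands in for closeClientOperation() followed by initFlags(), done atomically.
  void closeAndReinitialize() {
    synchronized (stateLock) {
      isQueryClosed = false;
      isLogBeingGenerated = true;
    }
  }

  // Stands in for the isQueryClosed check inside getQueryLog().
  boolean isClosedForLogging() {
    synchronized (stateLock) {
      return isQueryClosed;
    }
  }
}
{code}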



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive

2017-04-20 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977180#comment-15977180
 ] 

Vineet Garg commented on HIVE-15229:


[~cartershanklin] I don't think this patch will help much with supporting 
quantified comparison predicates, since it only caters to 'like any'. For 
quantified comparison predicates we would need a more generic UDF that handles 
all data types and operators, or maybe a rewrite that avoids adding a UDF. 
Is there a JIRA for supporting quantified comparison predicates?

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, 
> HIVE-15229.3.patch, HIVE-15229.4.patch, HIVE-15229.5.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when 
> matching a text field against a number of patterns.
> 'like any' and 'like all' are equivalent to multiple like conditions, as in 
> the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like condition 
> select col1 from table1 where col2 like '%accountant%' or col2 like 
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like 
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', 
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operator 
> select col1 from table1 where col2 like '%accountant%' and col2 like 
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like 
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata to 
> Hive, and data engineers and business analysts regularly look for these two 
> operators.
> If we introduce these two operators in Hive, then many scripts can be migrated 
> smoothly instead of converting these operators to multiple like conditions.
> Result:
> 1. The 'LIKE ANY' operator returns true if the text (column value) matches 
> any pattern.
> 2. The 'LIKE ALL' operator returns true if the text (column value) matches 
> all patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the 
> left hand side is NULL, but also if one of the patterns in the list is NULL.
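
A plain-Java sketch of the semantics in points 1-3 above, where a nullable Boolean stands in for SQL's three-valued result; this is only an illustration, not the proposed Hive UDF, and the pattern conversion ignores escape handling:

{code}
// Sketch only: a null text or a null pattern yields NULL, per points 1-3 above.
public final class LikeAnyAllSketch {
  private LikeAnyAllSketch() {}

  // Convert a SQL LIKE pattern (% and _) into a Java regex. Illustrative only:
  // it ignores escape characters and regex metacharacters inside the pattern.
  private static String likeToRegex(String pattern) {
    return pattern.replace("%", ".*").replace("_", ".");
  }

  private static boolean hasNull(String... values) {
    for (String v : values) {
      if (v == null) {
        return true;
      }
    }
    return false;
  }

  public static Boolean likeAny(String text, String... patterns) {
    if (text == null || hasNull(patterns)) {
      return null;
    }
    for (String p : patterns) {
      if (text.matches(likeToRegex(p))) {
        return Boolean.TRUE;
      }
    }
    return Boolean.FALSE;
  }

  public static Boolean likeAll(String text, String... patterns) {
    if (text == null || hasNull(patterns)) {
      return null;
    }
    for (String p : patterns) {
      if (!text.matches(likeToRegex(p))) {
        return Boolean.FALSE;
      }
    }
    return Boolean.TRUE;
  }
}
{code}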



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16451) Race condition between HiveStatement.getQueryLog and HiveStatement.runAsyncOnServer

2017-04-20 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977175#comment-15977175
 ] 

Vihang Karajgaonkar commented on HIVE-16451:


Thanks for finding this out [~pvary]. Although I didn't quite get how the patch 
fixes the race condition. The way I understand the issue is you have Log thread 
and the thread executing the HiveStatement. Both these threads are accessible 
isLogBeingGenerated, isCancelled, isQueryClosed flags in HiveStatement object. 
None of which are thread safe. I think there could be more undiscovered 
race-conditions in this execution path.

> Race condition between HiveStatement.getQueryLog and 
> HiveStatement.runAsyncOnServer
> ---
>
> Key: HIVE-16451
> URL: https://issues.apache.org/jira/browse/HIVE-16451
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, JDBC
>Affects Versions: 3.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16451.02.patch, HIVE-16451.03.patch, 
> HIVE-16451.patch
>
>
> During BeeLineDriver testing I have hit the following race condition:
> - Running the query asynchronously through BeeLine
> - Querying the logs in BeeLine
> In the following code:
> {code:title=HiveStatement.runAsyncOnServer}
>   private void runAsyncOnServer(String sql) throws SQLException {
> checkConnection("execute");
> closeClientOperation();
> initFlags();
> [..]
>   }
> {code}
> {code:title=HiveStatement.getQueryLog}
>   public List getQueryLog(boolean incremental, int fetchSize)
>   throws SQLException, ClosedOrCancelledStatementException {
> [..]
> try {
>   if (stmtHandle != null) {
> [..]
>   } else {
> if (isQueryClosed) {
>   throw new ClosedOrCancelledStatementException("Method getQueryLog() 
> failed. The " +
>   "statement has been closed or cancelled.");
> } else {
>   return logs;
> }
>   }
> } catch (SQLException e) {
> [..]
> }
> [..]
>   }
> {code}
> In runAsyncOnServer, {{closeClientOperation}} sets the {{isQueryClosed}} flag 
> to true:
> {code:title=HiveStatement.closeClientOperation}
>   void closeClientOperation() throws SQLException {
> [..]
> isQueryClosed = true;
> isExecuteStatementFailed = false;
> stmtHandle = null;
>   }
> {code}
> The {{initFlags}} sets it to false:
> {code}
>   private void initFlags() {
> isCancelled = false;
> isQueryClosed = false;
> isLogBeingGenerated = true;
> isExecuteStatementFailed = false;
> isOperationComplete = false;
>   }
> {code}
> If {{getQueryLog}} is called after {{closeClientOperation}} but before 
> {{initFlags}}, then we get the following warning if verbose mode is set to 
> true in BeeLine:
> {code}
> Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> getQueryLog() failed. The statement has been closed or cancelled. 
> (state=,code=0)
> {code}
> This caused the following failure:
> https://builds.apache.org/job/PreCommit-HIVE-Build/4691/testReport/org.apache.hadoop.hive.cli/TestBeeLineDriver/testCliDriver_smb_mapjoin_11_/
> {code}
> Error Message
> Client result comparison failed with error code = 1 while executing 
> fname=smb_mapjoin_11
> 16a17
> > Warning: org.apache.hive.jdbc.ClosedOrCancelledStatementException: Method 
> > getQueryLog() failed. The statement has been closed or cancelled. 
> > (state=,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15619) Column pruner should handle DruidQuery

2017-04-20 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-15619:

Attachment: HIVE-15619.03.patch

> Column pruner should handle DruidQuery
> --
>
> Key: HIVE-15619
> URL: https://issues.apache.org/jira/browse/HIVE-15619
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
> Attachments: HIVE-15619.01.patch, HIVE-15619.02.patch, 
> HIVE-15619.03.patch
>
>
> Even when we cannot push any operator into Druid, we might be able to prune 
> some of the columns that are read from the Druid sources.
> One solution would be to extend the ColumnPruner so it can push the needed 
> columns into DruidQuery.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15619) Column pruner should handle DruidQuery

2017-04-20 Thread Nishant Bangarwa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-15619:

Status: Open  (was: Patch Available)

> Column pruner should handle DruidQuery
> --
>
> Key: HIVE-15619
> URL: https://issues.apache.org/jira/browse/HIVE-15619
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
> Attachments: HIVE-15619.01.patch, HIVE-15619.02.patch, 
> HIVE-15619.03.patch
>
>
> Even when we cannot push any operator into Druid, we might be able to prune 
> some of the columns that are read from the Druid sources.
> One solution would be to extend the ColumnPruner so it can push the needed 
> columns into DruidQuery.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

