[jira] [Commented] (HIVE-14221) set SQLStdHiveAuthorizerFactoryForTest as default HIVE_AUTHORIZATION_MANAGER

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376341#comment-15376341
 ] 

Hive QA commented on HIVE-14221:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817760/HIVE-14221.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 512 failed/errored test(s), 10303 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.ql.TestTxnCommands.exchangePartition
org.apache.hadoop.hive.ql.TestTxnCommands.testDelete
org.apache.hadoop.hive.ql.TestTxnCommands.testDeleteIn
org.apache.hadoop.hive.ql.TestTxnCommands.testErrors
org.apache.hadoop.hive.ql.TestTxnCommands.testExplicitRollback
org.apache.hadoop.hive.ql.TestTxnCommands.testImplicitRollback
org.apache.hadoop.hive.ql.TestTxnCommands.testInsertOverwrite
org.apache.hadoop.hive.ql.TestTxnCommands.testMultipleDelete
org.apache.hadoop.hive.ql.TestTxnCommands.testMultipleInserts
org.apache.hadoop.hive.ql.TestTxnCommands.testReadMyOwnInsert
org.apache.hadoop.hive.ql.TestTxnCommands.testSimpleAcidInsert
org.apache.hadoop.hive.ql.TestTxnCommands.testTimeOutReaper
org.apache.hadoop.hive.ql.TestTxnCommands.testUpdateDeleteOfInserts
org.apache.hadoop.hive.ql.TestTxnCommands.testUpdateOfInserts
org.apache.hadoop.hive.ql.TestTxnCommands2.testAlterTable
org.apache.hadoop.hive.ql.TestTxnCommands2.testBucketizedInputFormat
org.apache.hadoop.hive.ql.TestTxnCommands2.testDeleteIn
org.apache.hadoop.hive.ql.TestTxnCommands2.testFailHeartbeater
org.apache.hadoop.hive.ql.TestTxnCommands2.testFileSystemUnCaching
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hadoop.hive.ql.TestTxnCommands2.testInsertOverwriteWithSelfJoin
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion1
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion2
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion3
org.apache.hadoop.hive.ql.TestTxnCommands2.testOpenTxnsCounter
org.apache.hadoop.hive.ql.TestTxnCommands2.testOrcNoPPD
org.apache.hadoop.hive.ql.TestTxnCommands2.testOrcPPD
org.apache.hadoop.hive.ql.TestTxnCommands2.testUpdateMixedCase
org.apache.hadoop.hive.ql.TestTxnCommands2.testValidTxnsBookkeeping
org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned
org.apache.hadoop.hive.ql.TestTxnCommands2.writeBetweenWorkerAndCleaner
org.apache.hadoop.hive.ql.exec.TestExecDriver.initializationError
org.apache.hadoop.hive.ql.exec.TestOperators.testFetchOperatorContext
org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperator
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testBuildDag
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testEmptyWork
org.apache.hadoop.hive.ql.hooks.TestHooks.org.apache.hadoop.hive.ql.hooks.TestHooks
org.apache.hadoop.hive.ql.io.TestSymlinkTextInputFormat.testCombine
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLExclusive
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLNoLock
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLShared
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDelete
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testExceptions
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testHeartbeater
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testJoin
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockAcquisitionAndRelease
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockTimeout
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testReadWrite
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testRollback
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadMultiPartition
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadPartition
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadTable
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleWritePartition
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleWriteTable
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testUpdate

[jira] [Assigned] (HIVE-13990) Client should not check dfs.namenode.acls.enabled to determine if extended ACLs are supported

2016-07-13 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome reassigned HIVE-13990:
--

Assignee: Chris Drome

> Client should not check dfs.namenode.acls.enabled to determine if extended 
> ACLs are supported
> -
>
> Key: HIVE-13990
> URL: https://issues.apache.org/jira/browse/HIVE-13990
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13990-branch-1.patch, HIVE-13990.1-branch-1.patch, 
> HIVE-13990.1.patch
>
>
> dfs.namenode.acls.enabled is a server-side configuration, and the client 
> should not presume to know how the server is configured. Barring a method for 
> querying the NN on whether ACLs are supported, the client should attempt the 
> operation and catch the appropriate exception.
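
The try-and-catch approach described above might look roughly like the following minimal sketch. Everything in it is hypothetical and only illustrates the control flow: `FakeNameNode`, `AclsNotSupportedError`, and `read_extended_acls` are stand-ins invented for this example, not Hadoop or Hive APIs.

```python
class AclsNotSupportedError(Exception):
    """Hypothetical stand-in for the server-side 'ACLs are disabled' error."""


class FakeNameNode:
    """Hypothetical server: only it knows whether ACLs are enabled."""

    def __init__(self, acls_enabled):
        self._acls_enabled = acls_enabled

    def get_acl_status(self, path):
        # The server decides; the client never reads this flag directly.
        if not self._acls_enabled:
            raise AclsNotSupportedError("ACLs are disabled on this NameNode")
        return ["user::rwx", "group::r-x", "other::---"]


def read_extended_acls(namenode, path):
    """Attempt the ACL call and fall back if the server rejects it.

    Returns the ACL entries, or None when the server reports that
    extended ACLs are unavailable. No client-side configuration is read.
    """
    try:
        return namenode.get_acl_status(path)
    except AclsNotSupportedError:
        return None
```

The point of the design is that the client's behavior depends only on the server's answer, so a client whose local configuration disagrees with the server's still behaves correctly.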



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14236) CTAS with UNION ALL puts the wrong stats + count(*) = 0 in Tez

2016-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14236:
---
Summary: CTAS with UNION ALL puts the wrong stats + count(*) = 0 in Tez  
(was: CTAS with UNION ALL puts the wrong stats + count(*) = 0)

> CTAS with UNION ALL puts the wrong stats + count(*) = 0 in Tez
> --
>
> Key: HIVE-14236
> URL: https://issues.apache.org/jira/browse/HIVE-14236
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> To repro in Tez: create table t as select * from src union all select * from 
> src;





[jira] [Updated] (HIVE-14236) CTAS with UNION ALL puts the wrong stats + count(*) = 0

2016-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14236:
---
Description: to repo. in Tez, create table t as select * from src union all 
select * from src;  (was: to repo. create table t as select * from src union 
all select * from src;)

> CTAS with UNION ALL puts the wrong stats + count(*) = 0
> ---
>
> Key: HIVE-14236
> URL: https://issues.apache.org/jira/browse/HIVE-14236
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> To repro in Tez: create table t as select * from src union all select * from 
> src;





[jira] [Updated] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13974:

Status: Patch Available  (was: In Progress)

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 2.1.0, 1.3.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch, 
> HIVE-13974.093.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Updated] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13974:

Attachment: HIVE-13974.093.patch

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch, 
> HIVE-13974.093.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Updated] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13974:

Status: In Progress  (was: Patch Available)

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 2.1.0, 1.3.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch, 
> HIVE-13974.093.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376258#comment-15376258
 ] 

Matt McCline commented on HIVE-13974:
-

[~sershe] Thank you for the review.  It helped move things forward.  I'm making 
changes.

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Commented] (HIVE-14228) Better row count estimates for outer join during physical planning

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376253#comment-15376253
 ] 

Hive QA commented on HIVE-14228:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817718/HIVE-14228.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10318 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/504/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/504/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-504/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12817718 - PreCommit-HIVE-MASTER-Build

> Better row count estimates for outer join during physical planning
> --
>
> Key: HIVE-14228
> URL: https://issues.apache.org/jira/browse/HIVE-14228
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 1.2.0, 2.0.0, 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14228.patch
>
>
> Currently, row counts for all join types are estimated as if they were outer 
> joins. We need to update that logic to take into account different join types.
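
To illustrate why the join type matters for the estimate, here is a minimal sketch. It assumes the textbook equi-join heuristic (|L| * |R| / max(NDV_L, NDV_R)) for the inner join and simple lower bounds for the outer variants. It is not Hive's actual logic or the logic in the attached patch; the function name and parameters are invented for this example.

```python
def estimate_join_rows(join_type, left_rows, right_rows, left_ndv, right_ndv):
    """Estimate join output rows, distinguishing join types (illustrative only)."""
    # Textbook inner equi-join estimate: |L| * |R| / max(ndv_L, ndv_R).
    inner = left_rows * right_rows // max(left_ndv, right_ndv, 1)
    if join_type == "inner":
        return inner
    if join_type == "left_outer":
        # Every left row appears at least once in the output.
        return max(inner, left_rows)
    if join_type == "right_outer":
        # Every right row appears at least once in the output.
        return max(inner, right_rows)
    if join_type == "full_outer":
        # Every row from both sides appears at least once.
        return max(inner, left_rows, right_rows)
    raise ValueError("unknown join type: " + join_type)
```

With a 1000-row fact side joined to a 10-row dimension side on a key that is unique on the fact side, the inner estimate is 10 rows while the left outer estimate is 1000, so treating every join as an outer join can overestimate considerably.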





[jira] [Commented] (HIVE-14216) CREATE TABLE LIKE doesn't copy some attributes

2016-07-13 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376243#comment-15376243
 ] 

niklaus xiao commented on HIVE-14216:
-

Fixed by https://issues.apache.org/jira/browse/HIVE-10771

> CREATE TABLE LIKE doesn't copy some attributes
> --
>
> Key: HIVE-14216
> URL: https://issues.apache.org/jira/browse/HIVE-14216
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Hive 1.1
> Hadoop 2.6
>Reporter: Ruslan Dautkhanov
>Priority: Critical
>
> CREATE TABLE LIKE doesn't copy some attributes, such as skip.header.line.count.
> We use CREATE TABLE LIKE to create tables from a template table.
> We have to re-apply skip.header.line.count=1 to the new table every time: 
> although the template table has skip.header.line.count set to 1, CREATE TABLE 
> LIKE does not carry it over to the new table.





[jira] [Commented] (HIVE-13191) DummyTable map joins mix up columns between tables

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376140#comment-15376140
 ] 

Hive QA commented on HIVE-13191:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817650/HIVE-13191.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10319 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/503/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/503/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-503/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12817650 - PreCommit-HIVE-MASTER-Build

> DummyTable map joins mix up columns between tables
> --
>
> Key: HIVE-13191
> URL: https://issues.apache.org/jira/browse/HIVE-13191
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13191.patch, tez.q
>
>
> {code}
> SELECT
>   a.key,
>   a.a_one,
>   b.b_one,
>   a.a_zero,
>   b.b_zero
> FROM
> (
> SELECT
>   11 key,
>   0 confuse_you,
>   1 a_one,
>   0 a_zero
> ) a
> LEFT JOIN
> (
> SELECT
>   11 key,
>   0 confuse_you,
>   1 b_one,
>   0 b_zero
> ) b
> ON a.key = b.key
> ;
> 11  1   0   0   1
> {code}
> This should be 11, 1, 1, 0, 0 instead. 
> Disabling map-joins & using shuffle-joins returns the right result.
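
As a sanity check of the expected row, the same query can be run against another engine. The sketch below uses Python's built-in `sqlite3` module purely as a reference; SQLite is an assumption of this example and says nothing about Hive's map-join code path. The query is the one from the description, with explicit `AS` aliases added.

```python
import sqlite3

# The query from the description, with explicit AS aliases.
QUERY = """
SELECT a.key, a.a_one, b.b_one, a.a_zero, b.b_zero
FROM (SELECT 11 AS key, 0 AS confuse_you, 1 AS a_one, 0 AS a_zero) a
LEFT JOIN (SELECT 11 AS key, 0 AS confuse_you, 1 AS b_one, 0 AS b_zero) b
ON a.key = b.key
"""


def run_reference_query():
    """Run the query in an in-memory SQLite database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


print(run_reference_query())
```

A result of `11, 1, 1, 0, 0` here matches the row the description says Hive should return.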





[jira] [Commented] (HIVE-9039) Support Union Distinct

2016-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376122#comment-15376122
 ] 

Pengcheng Xiong commented on HIVE-9039:
---

[~wenli], I'm sorry, but we do not have any plan to support the query that you 
posted. :) A workaround would be to rewrite it using subqueries.
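
The subquery workaround could be sketched as follows: each branch with its own LIMIT is wrapped in a subquery, and the UNION ALL is applied to the wrapped branches. The example runs in SQLite through Python's `sqlite3` module only to show the shape of the rewrite; the table contents and the aliases `a` and `b` are made up for this illustration, and behavior in a given Hive version should be checked separately.

```python
import sqlite3


def union_all_with_limits():
    """Apply a LIMIT to each UNION ALL branch by wrapping it in a subquery."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript("""
            CREATE TABLE src  (key INTEGER, value TEXT);
            CREATE TABLE src3 (key INTEGER, value TEXT);
            INSERT INTO src  VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
            INSERT INTO src3 VALUES (10, 'x'), (20, 'y'), (30, 'z'), (40, 'w');
        """)
        return conn.execute("""
            SELECT key, value FROM (SELECT key, value FROM src LIMIT 2) a
            UNION ALL
            SELECT key, value FROM (SELECT key, value FROM src3 LIMIT 3) b
        """).fetchall()
    finally:
        conn.close()


rows = union_all_with_limits()
print(len(rows))  # two rows from src plus three rows from src3
```

Without an ORDER BY inside each subquery, which rows a LIMIT keeps is unspecified, so only the row counts per branch are predictable.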

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 1.2.0
>
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, 
> HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch
>
>
> CLEAR LIBRARY CACHE
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.
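
The rewrite described above (union distinct expressed as union all followed by group by) can be checked on a small data set. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in engine; the tables `src` and `src2` and their contents are invented for this example, and it shows only that the two query forms agree, not how Hive's planner performs the rewrite.

```python
import sqlite3


def compare_union_rewrite():
    """Return (UNION result, UNION ALL + GROUP BY result) for sample data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript("""
            CREATE TABLE src  (key INTEGER, value TEXT);
            CREATE TABLE src2 (key INTEGER, value TEXT);
            INSERT INTO src  VALUES (1, 'a'), (1, 'a'), (2, 'b');
            INSERT INTO src2 VALUES (2, 'b'), (3, 'c');
        """)
        # UNION with no qualifier removes duplicates, i.e. UNION DISTINCT.
        distinct_rows = conn.execute("""
            SELECT key, value FROM src
            UNION
            SELECT key, value FROM src2
            ORDER BY key, value
        """).fetchall()
        # Rewritten form: UNION ALL in a subquery, then GROUP BY all columns.
        rewritten_rows = conn.execute("""
            SELECT key, value FROM (
                SELECT key, value FROM src
                UNION ALL
                SELECT key, value FROM src2
            ) u
            GROUP BY key, value
            ORDER BY key, value
        """).fetchall()
        return distinct_rows, rewritten_rows
    finally:
        conn.close()


distinct_rows, rewritten_rows = compare_union_rewrite()
print(distinct_rows == rewritten_rows)
```

Grouping by every selected column collapses duplicate rows, which is exactly the deduplication that union distinct performs.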





[jira] [Updated] (HIVE-14234) TestHiveMetaStorePartitionSpecs does not drop database created in this test causes other test failure

2016-07-13 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao updated HIVE-14234:

Assignee: Mithun Radhakrishnan  (was: niklaus xiao)

> TestHiveMetaStorePartitionSpecs does not drop database created in this test 
> causes other test failure
> -
>
> Key: HIVE-14234
> URL: https://issues.apache.org/jira/browse/HIVE-14234
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.3.0, 2.1.0
>Reporter: niklaus xiao
>Assignee: Mithun Radhakrishnan
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-14234.patch
>
>
> TestHiveMetaStorePartitionSpecs creates a database named 
> testpartitionspecs_db but never drops it, which sometimes causes 
> TestObjectStore#testDatabaseOps to fail:
> {code}
> testDatabaseOps(org.apache.hadoop.hive.metastore.TestObjectStore)  Time 
> elapsed: 0.188 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps(TestObjectStore.java:120)
> {code}
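
The failure mode above, and the kind of cleanup that avoids it, can be illustrated with a minimal sketch. `FakeMetastore` and the helper functions are hypothetical stand-ins invented for this example; they are not the Hive metastore client API and not the contents of the attached patch.

```python
class FakeMetastore:
    """Hypothetical shared metastore that outlives individual tests."""

    def __init__(self):
        self.databases = {"default"}

    def create_database(self, name):
        self.databases.add(name)

    def drop_database(self, name):
        self.databases.discard(name)


def run_partition_specs_test(metastore, clean_up):
    """Simulate a test that creates a database, optionally dropping it afterwards."""
    metastore.create_database("testpartitionspecs_db")
    try:
        pass  # the real test body would exercise the partition-spec API here
    finally:
        if clean_up:
            # Teardown: remove everything this test created.
            metastore.drop_database("testpartitionspecs_db")


def count_databases_after(clean_up):
    """Return how many databases a later test would see."""
    metastore = FakeMetastore()
    run_partition_specs_test(metastore, clean_up)
    return len(metastore.databases)
```

Without the cleanup, a later test that asserts on the number of databases sees one more than it expects, which is the shape of the `expected:<2> but was:<3>` failure in the trace.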





[jira] [Commented] (HIVE-14234) TestHiveMetaStorePartitionSpecs does not drop database created in this test causes other test failure

2016-07-13 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376113#comment-15376113
 ] 

niklaus xiao commented on HIVE-14234:
-

cc [~alangates]

> TestHiveMetaStorePartitionSpecs does not drop database created in this test 
> causes other test failure
> -
>
> Key: HIVE-14234
> URL: https://issues.apache.org/jira/browse/HIVE-14234
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.3.0, 2.1.0
>Reporter: niklaus xiao
>Assignee: niklaus xiao
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-14234.patch
>
>
> TestHiveMetaStorePartitionSpecs creates a database named 
> testpartitionspecs_db but never drops it, which sometimes causes 
> TestObjectStore#testDatabaseOps to fail:
> {code}
> testDatabaseOps(org.apache.hadoop.hive.metastore.TestObjectStore)  Time 
> elapsed: 0.188 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps(TestObjectStore.java:120)
> {code}





[jira] [Commented] (HIVE-14234) TestHiveMetaStorePartitionSpecs does not drop database created in this test causes other test failure

2016-07-13 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376111#comment-15376111
 ] 

niklaus xiao commented on HIVE-14234:
-

Small patch.

Can you take a look, since you are the original author of this code?  [~mithun]

> TestHiveMetaStorePartitionSpecs does not drop database created in this test 
> causes other test failure
> -
>
> Key: HIVE-14234
> URL: https://issues.apache.org/jira/browse/HIVE-14234
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.3.0, 2.1.0
>Reporter: niklaus xiao
>Assignee: niklaus xiao
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-14234.patch
>
>
> TestHiveMetaStorePartitionSpecs creates a database named 
> testpartitionspecs_db but never drops it, which sometimes causes 
> TestObjectStore#testDatabaseOps to fail:
> {code}
> testDatabaseOps(org.apache.hadoop.hive.metastore.TestObjectStore)  Time 
> elapsed: 0.188 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps(TestObjectStore.java:120)
> {code}





[jira] [Updated] (HIVE-14234) TestHiveMetaStorePartitionSpecs does not drop database created in this test causes other test failure

2016-07-13 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao updated HIVE-14234:

Attachment: HIVE-14234.patch

> TestHiveMetaStorePartitionSpecs does not drop database created in this test 
> causes other test failure
> -
>
> Key: HIVE-14234
> URL: https://issues.apache.org/jira/browse/HIVE-14234
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.3.0, 2.1.0
>Reporter: niklaus xiao
>Assignee: niklaus xiao
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: HIVE-14234.patch
>
>
> TestHiveMetaStorePartitionSpecs creates a database named 
> testpartitionspecs_db but never drops it, which sometimes causes 
> TestObjectStore#testDatabaseOps to fail:
> {code}
> testDatabaseOps(org.apache.hadoop.hive.metastore.TestObjectStore)  Time 
> elapsed: 0.188 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps(TestObjectStore.java:120)
> {code}





[jira] [Updated] (HIVE-14234) TestHiveMetaStorePartitionSpecs does not drop database created in this test causes other test failure

2016-07-13 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao updated HIVE-14234:

Fix Version/s: 1.3.0
   Status: Patch Available  (was: Open)

> TestHiveMetaStorePartitionSpecs does not drop database created in this test 
> causes other test failure
> -
>
> Key: HIVE-14234
> URL: https://issues.apache.org/jira/browse/HIVE-14234
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.1.0, 1.3.0
>Reporter: niklaus xiao
>Assignee: niklaus xiao
>Priority: Minor
> Fix For: 1.3.0
>
>
> TestHiveMetaStorePartitionSpecs creates a database named 
> testpartitionspecs_db but never drops it, which sometimes causes 
> TestObjectStore#testDatabaseOps to fail:
> {code}
> testDatabaseOps(org.apache.hadoop.hive.metastore.TestObjectStore)  Time 
> elapsed: 0.188 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps(TestObjectStore.java:120)
> {code}





[jira] [Commented] (HIVE-9039) Support Union Distinct

2016-07-13 Thread wangwenli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376108#comment-15376108
 ] 

wangwenli commented on HIVE-9039:
-

[~pxiong], yes, you are correct. That is not what I meant; I just wanted to 
point out that queries people once wrote that way should be changed after this.
Coming to the union feature: it must currently be placed inside subqueries. Is 
there any plan to remove this limitation and support a query like this?

(select key,value from src limit 2)
UNION ALL
(select key,value from src3 limit 3)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 1.2.0
>
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, 
> HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch
>
>
> CLEAR LIBRARY CACHE
> Current version (Hive 0.14) does not support union (or union distinct). It 
> only supports union all. In this patch, we try to add this new feature by 
> rewriting union distinct to union all followed by group by.





[jira] [Commented] (HIVE-13995) Hive generates inefficient metastore queries for TPCDS tables with 1800+ partitions leading to higher compile time

2016-07-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376104#comment-15376104
 ] 

Ashutosh Chauhan commented on HIVE-13995:
-

This will be much more useful if we can do both tasks (retrieving partitions & 
stats) in a single MySQL query.

> Hive generates inefficient metastore queries for TPCDS tables with 1800+ 
> partitions leading to higher compile time
> --
>
> Key: HIVE-13995
> URL: https://issues.apache.org/jira/browse/HIVE-13995
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Nita Dembla
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13995.1.patch, HIVE-13995.2.patch
>
>
> TPCDS fact tables (store_sales, catalog_sales) have 1800+ partitions, and when 
> the query does not have a filter on the partition column, the metastore queries 
> generated have a large IN clause listing all the partition names. Most RDBMS 
> systems have trouble optimizing large IN clauses, and even when a good index 
> plan is chosen, comparing against 1800+ string values will not lead to the best 
> execution time.
> When all partitions are chosen, omitting the partition list and filtering 
> only on table and column name will generate the same result set as 
> long as there are no concurrent modifications to the partition list of the Hive 
> table (adding/dropping partitions).
> For example, for TPCDS query18, the metastore query gathering partition column 
> statistics runs in 0.5 secs in MySQL. The following is output from the MySQL log:
> {noformat}
> -- Query_time: 0.482063  Lock_time: 0.003037 Rows_sent: 1836  Rows_examined: 18360
> select count("COLUMN_NAME") from "PART_COL_STATS"
>  where "DB_NAME" = 'tpcds_bin_partitioned_orc_3' and "TABLE_NAME" = 
> 'catalog_sales' 
>  and "COLUMN_NAME" in 
> ('cs_bill_customer_sk','cs_bill_cdemo_sk','cs_item_sk','cs_quantity','cs_list_price','cs_sales_price','cs_coupon_amt','cs_net_profit')
>  and "PARTITION_NAME" in 
> ('cs_sold_date_sk=2450815','cs_sold_date_sk=2450816','cs_sold_date_sk=2450817','cs_sold_date_sk=2450818','cs_sold_date_sk=2450819','cs_sold_date_sk=2450820','cs_sold_date_sk=2450821','cs_sold_date_sk=2450822','cs_sold_date_sk=2450823','cs_sold_date_sk=2450824','cs_sold_date_sk=2450825','cs_sold_date_sk=2450826','cs_sold_date_sk=2450827','cs_sold_date_sk=2450828','cs_sold_date_sk=2450829','cs_sold_date_sk=2450830','cs_sold_date_sk=2450831','cs_sold_date_sk=2450832','cs_sold_date_sk=2450833','cs_sold_date_sk=2450834','cs_sold_date_sk=2450835','cs_sold_date_sk=2450836','cs_sold_date_sk=2450837','cs_sold_date_sk=2450838','cs_sold_date_sk=2450839','cs_sold_date_sk=2450840','cs_sold_date_sk=2450841','cs_sold_date_sk=2450842','cs_sold_date_sk=2450843','cs_sold_date_sk=2450844','cs_sold_date_sk=2450845','cs_sold_date_sk=2450846','cs_sold_date_sk=2450847','cs_sold_date_sk=2450848','cs_sold_date_sk=2450849','cs_sold_date_sk=2450850','cs_sold_date_sk=2450851','cs_sold_date_sk=2450852','cs_sold_date_sk=2450853','cs_sold_date_sk=2450854','cs_sold_date_sk=2450855','cs_sold_date_sk=2450856',...,'cs_sold_date_sk=2452654')
>  group by "PARTITION_NAME";
> {noformat}
> A functionally equivalent query runs in 0.1 seconds:
> {noformat}
> -- Query_time: 0.121296  Lock_time: 0.000156 Rows_sent: 1836  Rows_examined: 18360
> select count("COLUMN_NAME") from "PART_COL_STATS"
>  where "DB_NAME" = 'tpcds_bin_partitioned_orc_3' and "TABLE_NAME" = 
> 'catalog_sales'  and "COLUMN_NAME" in 
> ('cs_bill_customer_sk','cs_bill_cdemo_sk','cs_item_sk','cs_quantity','cs_list_price','cs_sales_price','cs_coupon_amt','cs_net_profit')
>  group by "PARTITION_NAME";
> {noformat}
> If removing the partition list seems drastic, it's also possible to simply 
> list the range, since Hive gets an ordered list of partition names. This 
> performs as well as the earlier query:
> {noformat}
> # Query_time: 0.143874  Lock_time: 0.000154 Rows_sent: 1836  Rows_examined: 18360
> SET timestamp=1464014881;
> select count("COLUMN_NAME") from "PART_COL_STATS" where "DB_NAME" = 
> 'tpcds_bin_partitioned_orc_3' and "TABLE_NAME" = 'catalog_sales'  and 
> "COLUMN_NAME" in 
> ('cs_bill_customer_sk','cs_bill_cdemo_sk','cs_item_sk','cs_quantity','cs_list_price','cs_sales_price','cs_coupon_amt','cs_net_profit')
>   and "PARTITION_NAME" >= 'cs_sold_date_sk=2450815' and "PARTITION_NAME" <= 
> 'cs_sold_date_sk=2452654' 
> group by "PARTITION_NAME";
> {noformat}
> Another thing to check is the IN clause of column names. Columns in the 
> projection list of the Hive query are mentioned here. Not sure if statistics for 
> these columns are required for Hive query optimization.
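
The three query shapes in the description (no partition filter, a range, or the full IN list) could be chosen along the following lines. This is a hedged sketch only, not the metastore's query builder: PartitionFilterSketch, quote and partitionPredicate are hypothetical names, and the code assumes that both lists are sorted in the same order the database uses for string comparison and that allSorted holds every partition of the table.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; not the metastore's actual query builder.
class PartitionFilterSketch {

    // Quote a value as a SQL string literal, doubling embedded single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds the PARTITION_NAME predicate to append to the WHERE clause.
    // Assumption: selected is a sorted sub-list of allSorted.
    static String partitionPredicate(List<String> selected, List<String> allSorted) {
        if (selected.isEmpty()) {
            return " and 1 = 0"; // nothing selected: match no rows
        }
        if (selected.equals(allSorted)) {
            return ""; // every partition selected: no partition filter needed
        }
        int start = allSorted.indexOf(selected.get(0));
        int end = start + selected.size();
        boolean contiguous = start >= 0 && end <= allSorted.size()
                && allSorted.subList(start, end).equals(selected);
        if (contiguous) {
            // A contiguous run of the ordered list can be expressed as a range.
            return " and \"PARTITION_NAME\" >= " + quote(selected.get(0))
                    + " and \"PARTITION_NAME\" <= " + quote(selected.get(selected.size() - 1));
        }
        // Fall back to the explicit IN list.
        return " and \"PARTITION_NAME\" in ("
                + selected.stream().map(PartitionFilterSketch::quote).collect(Collectors.joining(","))
                + ")";
    }
}
```

As the description notes, dropping the filter is only equivalent while no partitions are added or dropped concurrently; the range form has the same caveat for partitions that sort inside the range.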





[jira] [Updated] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11402:

Attachment: HIVE-11402.06.patch

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.05.patch, 
> HIVE-11402.06.patch, HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC; you need to spawn another thread, as the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.
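
One way to picture "serialize all query execution for a single HS2 session" is a per-session permit that every query must hold while it runs. The sketch below is an assumption-laden illustration, not the attached patch: SerializedSessionSketch and runExclusive are hypothetical names and no HiveServer2 classes are involved.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch; HiveServer2's real session classes are not used here.
class SerializedSessionSketch {

    // One fair permit: at most one query runs in this session at a time,
    // and waiting queries are admitted in arrival order.
    private final Semaphore permit = new Semaphore(1, true);

    // Runs the query while holding the session permit; other callers block
    // until the running query finishes or fails.
    <T> T runExclusive(Supplier<T> query) {
        permit.acquireUninterruptibly();
        try {
            return query.get();
        } finally {
            permit.release();
        }
    }
}
```

Each session would own one such object, so queries in different sessions still run in parallel while queries sharing a session (and therefore a SessionState) do not overlap.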





[jira] [Updated] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11402:

Attachment: (was: HIVE-11402.06.patch)

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.05.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC; you need to spawn another thread, as the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.





[jira] [Updated] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11402:

Attachment: HIVE-11402.06.patch

Will commit after HiveQA

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.05.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC; you need to spawn another thread, as the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.





[jira] [Commented] (HIVE-14187) JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called

2016-07-13 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376088#comment-15376088
 ] 

Mohit Sabharwal commented on HIVE-14187:


Attaching patch after correct rebase this time.

> JDOPersistenceManager objects remain cached if MetaStoreClient#close is not 
> called
> --
>
> Key: HIVE-14187
> URL: https://issues.apache.org/jira/browse/HIVE-14187
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-14187.1.patch, HIVE-14187.2.patch, 
> HIVE-14187.patch, HIVE-14187.patch
>
>
> JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by 
> DataNucleus.
> A new JDOPersistenceManager object gets created for every HMS thread since 
> ObjectStore is a thread local.
> In non-embedded metastore mode, the JDOPersistenceManager associated with a 
> thread only gets cleaned up if IMetaStoreClient#close is called by the client 
> (which calls ObjectStore#shutdown, which calls JDOPersistenceManager#close, 
> which in turn removes the object from the cache in 
> JDOPersistenceManagerFactory#releasePersistenceManager
> https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271),
>  i.e. the object will remain cached if the client does not call close.
> For example: if one interrupts out of the Hive CLI shell (instead of using the 
> 'exit;' command), SessionState#close does not get called, and hence 
> IMetaStoreClient#close does not get called.
> Instead of relying on the client to call close, it's cleaner to automatically 
> perform RawStore-related cleanup at the server end via deleteContext(), which 
> gets called when the server detects a lost/closed connection.
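
The leak and the proposed server-side cleanup can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the patch: ThreadLocalStoreSketch, CACHE, STORE, getStore and onConnectionClosed are hypothetical stand-ins for the factory cache, the thread-local store and the deleteContext() hook, and the hook is assumed to run on the same handler thread that served the connection.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; these members only mimic the roles described above.
class ThreadLocalStoreSketch {

    // Stand-in for the factory-level cache of persistence managers.
    static final Map<Thread, String> CACHE = new ConcurrentHashMap<>();

    // Stand-in for the per-thread store (ObjectStore is a thread local).
    static final ThreadLocal<String> STORE = new ThreadLocal<>();

    // Lazily creates the store for the calling handler thread and caches it.
    static String getStore() {
        String store = STORE.get();
        if (store == null) {
            store = "store-for-" + Thread.currentThread().getName();
            STORE.set(store);
            CACHE.put(Thread.currentThread(), store);
        }
        return store;
    }

    // Server-side hook, analogous in role to deleteContext(): releases the
    // cached object whether or not the client ever called close.
    static void onConnectionClosed() {
        if (STORE.get() != null) {
            CACHE.remove(Thread.currentThread());
            STORE.remove();
        }
    }
}
```

Without the hook, an entry stays in CACHE for every handler thread whose client disconnected without calling close, which is the behavior the issue reports.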





[jira] [Updated] (HIVE-14187) JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called

2016-07-13 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-14187:
---
Attachment: HIVE-14187.2.patch

> JDOPersistenceManager objects remain cached if MetaStoreClient#close is not 
> called
> --
>
> Key: HIVE-14187
> URL: https://issues.apache.org/jira/browse/HIVE-14187
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-14187.1.patch, HIVE-14187.2.patch, 
> HIVE-14187.patch, HIVE-14187.patch
>
>
> JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by 
> DataNucleus.
> A new JDOPersistenceManager object gets created for every HMS thread since 
> ObjectStore is a thread local.
> In non-embedded metastore mode, the JDOPersistenceManager associated with a 
> thread only gets cleaned up if IMetaStoreClient#close is called by the client 
> (which calls ObjectStore#shutdown, which calls JDOPersistenceManager#close, 
> which in turn removes the object from the cache in 
> JDOPersistenceManagerFactory#releasePersistenceManager
> https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271),
>  i.e. the object will remain cached if the client does not call close.
> For example: if one interrupts out of the Hive CLI shell (instead of using the 
> 'exit;' command), SessionState#close does not get called, and hence 
> IMetaStoreClient#close does not get called.
> Instead of relying on the client to call close, it's cleaner to automatically 
> perform RawStore-related cleanup at the server end via deleteContext(), which 
> gets called when the server detects a lost/closed connection.





[jira] [Resolved] (HIVE-13569) Add test for llap file system counters after updating to tez 0.8.3

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-13569.
--
Resolution: Implemented

Implemented as part of HIVE-13258

> Add test for llap file system counters after updating to tez 0.8.3
> --
>
> Key: HIVE-13569
> URL: https://issues.apache.org/jira/browse/HIVE-13569
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Use post hook to print llap counters for *llap.q tests after tez 0.8.3 
> upgrade.





[jira] [Commented] (HIVE-9756) LLAP: use log4j 2 for llap (log to separate files, etc.)

2016-07-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376059#comment-15376059
 ] 

Prasanth Jayachandran commented on HIVE-9756:
-

[~sseth] Added 2 loggers "dag-routing" and "query-routing". The service driver 
is provided with an option to specify logger and the default is dag-routing. 
Can you please take another look?

> LLAP: use log4j 2 for llap (log to separate files, etc.)
> 
>
> Key: HIVE-9756
> URL: https://issues.apache.org/jira/browse/HIVE-9756
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gunther Hagleitner
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9756.1.patch, HIVE-9756.2.patch, HIVE-9756.3.patch, 
> HIVE-9756.4.patch, HIVE-9756.4.patch, HIVE-9756.5.patch, HIVE-9756.6.patch, 
> HIVE-9756.7.patch
>
>
> For the INFO logging, we'll need to use the log4j-jcl 2.x upgrade-path to get 
> throughput friendly logging.
> http://logging.apache.org/log4j/2.0/manual/async.html#Performance





[jira] [Updated] (HIVE-9756) LLAP: use log4j 2 for llap (log to separate files, etc.)

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9756:

Attachment: HIVE-9756.7.patch

Rebase after HIVE-13258 commit

> LLAP: use log4j 2 for llap (log to separate files, etc.)
> 
>
> Key: HIVE-9756
> URL: https://issues.apache.org/jira/browse/HIVE-9756
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gunther Hagleitner
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9756.1.patch, HIVE-9756.2.patch, HIVE-9756.3.patch, 
> HIVE-9756.4.patch, HIVE-9756.4.patch, HIVE-9756.5.patch, HIVE-9756.6.patch, 
> HIVE-9756.7.patch
>
>
> For the INFO logging, we'll need to use the log4j-jcl 2.x upgrade-path to get 
> throughput friendly logging.
> http://logging.apache.org/log4j/2.0/manual/async.html#Performance





[jira] [Commented] (HIVE-14187) JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called

2016-07-13 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376053#comment-15376053
 ] 

Mohit Sabharwal commented on HIVE-14187:


Thanks for review [~vgumashta].  Attaching patch after rebase.

> JDOPersistenceManager objects remain cached if MetaStoreClient#close is not 
> called
> --
>
> Key: HIVE-14187
> URL: https://issues.apache.org/jira/browse/HIVE-14187
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-14187.1.patch, HIVE-14187.patch, HIVE-14187.patch
>
>
> JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by 
> DataNucleus.
> A new JDOPersistenceManager object gets created for every HMS thread since 
> ObjectStore is a thread local.
> In non-embedded metastore mode, the JDOPersistenceManager associated with a 
> thread only gets cleaned up if IMetaStoreClient#close is called by the client 
> (which calls ObjectStore#shutdown, which calls JDOPersistenceManager#close, 
> which in turn removes the object from the cache in 
> JDOPersistenceManagerFactory#releasePersistenceManager
> https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271),
>  i.e. the object will remain cached if the client does not call close.
> For example: if one interrupts out of the Hive CLI shell (instead of using the 
> 'exit;' command), SessionState#close does not get called, and hence 
> IMetaStoreClient#close does not get called.
> Instead of relying on the client to call close, it's cleaner to automatically 
> perform RawStore-related cleanup at the server end via deleteContext(), which 
> gets called when the server detects a lost/closed connection.





[jira] [Updated] (HIVE-14187) JDOPersistenceManager objects remain cached if MetaStoreClient#close is not called

2016-07-13 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-14187:
---
Attachment: HIVE-14187.1.patch

> JDOPersistenceManager objects remain cached if MetaStoreClient#close is not 
> called
> --
>
> Key: HIVE-14187
> URL: https://issues.apache.org/jira/browse/HIVE-14187
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-14187.1.patch, HIVE-14187.patch, HIVE-14187.patch
>
>
> JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by 
> DataNucleus.
> A new JDOPersistenceManager object gets created for every HMS thread since 
> ObjectStore is a thread local.
> In non-embedded metastore mode, the JDOPersistenceManager associated with a 
> thread only gets cleaned up if IMetaStoreClient#close is called by the client 
> (which calls ObjectStore#shutdown, which calls JDOPersistenceManager#close, 
> which in turn removes the object from the cache in 
> JDOPersistenceManagerFactory#releasePersistenceManager
> https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271),
>  i.e. the object will remain cached if the client does not call close.
> For example: if one interrupts out of the Hive CLI shell (instead of using the 
> 'exit;' command), SessionState#close does not get called, and hence 
> IMetaStoreClient#close does not get called.
> Instead of relying on the client to call close, it's cleaner to automatically 
> perform RawStore-related cleanup at the server end via deleteContext(), which 
> gets called when the server detects a lost/closed connection.





[jira] [Commented] (HIVE-14135) beeline output not formatted correctly for large column widths

2016-07-13 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376046#comment-15376046
 ] 

Vihang Karajgaonkar commented on HIVE-14135:


Thanks [~spena] for taking a look. Can you take a look at the latest review 
(version 2 in the review board)? The setup() method is removed.

I think DEFAULT_MAX_WIDTH is the max line (row) width and 
DEFAULT_MAX_COLUMN_WIDTH is the max width for column. According to 
documentation 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

--maxWidth=MAXWIDTH The maximum column width, in characters, when 
outputformat is table. Default is 15.
Usage: beeline --maxColumnWidth=25

--maxColumnWidth=MAXCOLWIDTH The maximum column width, in characters, when 
outputformat is table. Default is 15.
Usage: beeline --maxColumnWidth=25



> beeline output not formatted correctly for large column widths
> --
>
> Key: HIVE-14135
> URL: https://issues.apache.org/jira/browse/HIVE-14135
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14135.1.patch, HIVE-14135.2.patch, 
> longKeyValues.txt, output_after.txt, output_before.txt
>
>
> If a column's width is too large, beeline uses the maximum column width 
> when normalizing all the column widths. To reproduce the issue, run 
> set -v; 
> One of the configuration variables is the classpath, which can have an 
> extremely large width (41k characters in my environment).
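
To make the width problem concrete, here is a small sketch of one plausible way to bound display widths: each column gets the width of its widest value, capped at a maximum column width, so a single very long value (such as a classpath) cannot stretch every column. This is an assumption about the general approach, not necessarily what the attached patch does; ColumnWidthSketch and displayWidths are hypothetical names, and rows are assumed to be rectangular with non-null values.

```java
// Hypothetical sketch; not beeline's actual BufferedRows implementation.
class ColumnWidthSketch {

    // Returns one display width per column: the widest value seen in that
    // column, but never more than maxColumnWidth characters.
    static int[] displayWidths(String[][] rows, int maxColumnWidth) {
        int cols = rows.length == 0 ? 0 : rows[0].length;
        int[] widths = new int[cols];
        for (String[] row : rows) {
            for (int c = 0; c < cols; c++) {
                int capped = Math.min(row[c].length(), maxColumnWidth);
                widths[c] = Math.max(widths[c], capped);
            }
        }
        return widths;
    }
}
```

With a cap of 15, a 25-character value in the second column yields a width of 15 for that column only, while a short first column keeps its natural width.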





[jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376043#comment-15376043
 ] 

Sergey Shelukhin commented on HIVE-13974:
-

Some comments on RB. Someone else can continue the review too, I will be away 
for a little bit after today.

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375997#comment-15375997
 ] 

Matt McCline commented on HIVE-13974:
-

Good run.  Test failures appear unrelated.

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.





[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification

2016-07-13 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-13989:
---
Description: 
Hive takes two approaches to working with extended ACLs depending on whether 
data is being produced via a Hive query or HCatalog APIs. A Hive query will run 
an FsShell command to recursively set the extended ACLs for a directory 
sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
programmatically and run some code to set the ACLs to match the parent 
directory.

Some incorrect assumptions were made when implementing the extended ACLs 
support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
design documents of extended ACLs in HDFS. These documents model the 
implementation after the POSIX implementation on Linux, which can be found at 
http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.

The code for setting extended ACLs via HCatalog APIs is found in HdfsUtils.java:

{code}
if (aclEnabled) {
  aclStatus = sourceStatus.getAclStatus();
  if (aclStatus != null) {
    LOG.trace(aclStatus.toString());
    aclEntries = aclStatus.getEntries();
    removeBaseAclEntries(aclEntries);

    //the ACL api's also expect the tradition user/group/other permission
    //in the form of ACL
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER,
        sourcePerm.getUserAction()));
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP,
        sourcePerm.getGroupAction()));
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER,
        sourcePerm.getOtherAction()));
  }
}
{code}

We found that DEFAULT extended ACL rules were not being inherited properly by 
the directory sub-tree, so the above code is incomplete because it effectively 
drops the DEFAULT rules. The second problem is with the call to 
{{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
ACLs. When extended ACLs are used the GROUP permission is replaced with the 
extended ACL mask. So the above code will apply the wrong permissions to the 
GROUP. Instead the correct GROUP permissions now need to be pulled from the 
AclEntry as returned by {{getAclStatus().getEntries()}}. See the implementation 
of the new method {{getDefaultAclEntries}} for details.
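
The GROUP problem described above can be shown with a small self-contained model. This is only a sketch and deliberately avoids Hadoop's own AclEntry and FsPermission classes: AclGroupSketch, Entry and groupPermission are hypothetical names, and permissions are plain strings such as "r-x". It encodes the rule from the paragraph above: when extended ACLs are present, the group bits of the basic permission carry the mask, so the real group permission has to come from the unnamed access GROUP entry.

```java
import java.util.List;

// Hypothetical model; it does not use Hadoop's AclEntry or FsPermission.
class AclGroupSketch {

    static final class Entry {
        final String scope; // "access" or "default"
        final String type;  // "user", "group", "mask" or "other"
        final String name;  // null for unnamed entries such as group::r-x
        final String perm;  // for example "r-x"

        Entry(String scope, String type, String name, String perm) {
            this.scope = scope;
            this.type = type;
            this.name = name;
            this.perm = perm;
        }
    }

    // Returns the permission to apply to the owning group. basicGroupBits is
    // what the basic permission reports, which is the mask when extended
    // ACLs are in use.
    static String groupPermission(String basicGroupBits, List<Entry> entries) {
        for (Entry e : entries) {
            if ("access".equals(e.scope) && "group".equals(e.type) && e.name == null) {
                return e.perm; // unnamed access GROUP entry holds the real value
            }
        }
        return basicGroupBits; // no extended ACL entry: basic bits are correct
    }
}
```

For the acl_test directory in the reproduction below, the basic group bits read rwx (the mask) while the unnamed group entry is r-x, so this lookup returns r-x rather than rwx.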

Similar issues exist with the HCatalog API. None of the APIs account for 
setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
API allow the extended ACLs to be passed into the required methods similar to 
how basic permissions are passed in. When building the directory sub-tree the 
extended ACLs of the table directory are inherited by all sub-directories, 
including the DEFAULT rules.

Replicating the problem:

Create a table to write data into (I will use acl_test as the destination and 
words_text as the source) and set the ACLs as follows:

{noformat}
$ hdfs dfs -setfacl -m default:user::rwx,default:group::r-x,default:mask::rwx,default:user:hdfs:rwx,group::r-x,user:hdfs:rwx /user/cdrome/hive/acl_test

$ hdfs dfs -ls -d /user/cdrome/hive/acl_test
drwxrwx---+  - cdrome hdfs  0 2016-07-13 20:36 /user/cdrome/hive/acl_test

$ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
# file: /user/cdrome/hive/acl_test
# owner: cdrome
# group: hdfs
user::rwx
user:hdfs:rwx
group::r-x
mask::rwx
other::---
default:user::rwx
default:user:hdfs:rwx
default:group::r-x
default:mask::rwx
default:other::---
{noformat}

Note that the basic GROUP permission is set to {{rwx}} after setting the ACLs. 
The ACLs explicitly set the DEFAULT rules and a rule specifically for the 
{{hdfs}} user.

Run the following query to populate the table:

{noformat}
insert into acl_test partition (dt='a', ds='b') select a, b from words_text 
where dt = 'c';
{noformat}

Note that words_text only has a single partition key.

Now examine the ACLs for the resulting directories:

{noformat}
$ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test
# file: /user/cdrome/hive/acl_test
# owner: cdrome
# group: hdfs
user::rwx
user:hdfs:rwx
group::r-x
mask::rwx
other::---
default:user::rwx
default:user:hdfs:rwx
default:group::r-x
default:mask::rwx
default:other::---

# file: /user/cdrome/hive/acl_test/dt=a
# owner: cdrome
# group: hdfs
user::rwx
user:hdfs:rwx
group::rwx
mask::rwx
other::---
default:user::rwx
default:user:hdfs:rwx
default:group::rwx
default:mask::rwx
default:other::---

# file: /user/cdrome/hive/acl_test/dt=a/ds=b
# owner: cdrome
# group: hdfs
user::rwx
user:hdfs:rwx
group::rwx
mask::rwx
other::---
default:user::rwx
default:user:hdfs:rwx
default:group::rwx
default:mask::rwx
default:other::---

# file: /user/cdrome/hive/acl_test/dt=a/ds=b/00_0.deflate
# owner: cdrome
# group: hdfs
user::rwx
user:hdfs:rwx
group::rwx
mask::rwx
other::---
{noformat}

Note that the GROUP permission is now erroneously set to {{rwx}} because of the 
code mentioned above; it is set to the 

[jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375990#comment-15375990
 ] 

Hive QA commented on HIVE-13974:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817591/HIVE-13974.092.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10318 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/502/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/502/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-502/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12817591 - PreCommit-HIVE-MASTER-Build

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch, HIVE-13974.05.WIP.patch, 
> HIVE-13974.06.patch, HIVE-13974.07.patch, HIVE-13974.08.patch, 
> HIVE-13974.09.patch, HIVE-13974.091.patch, HIVE-13974.092.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema which doesn't work for adding columns to non-last STRUCT data 
> type columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10328) Enable new return path for cbo

2016-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-10328:
---
Status: Patch Available  (was: Open)

Turning the return path flag on to see which tests are failing.

> Enable new return path for cbo
> --
>
> Key: HIVE-10328
> URL: https://issues.apache.org/jira/browse/HIVE-10328
> Project: Hive
>  Issue Type: Task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
> Attachments: HIVE-10328.1.patch, HIVE-10328.10.patch, 
> HIVE-10328.11.patch, HIVE-10328.12.patch, HIVE-10328.13.patch, 
> HIVE-10328.14.patch, HIVE-10328.2.patch, HIVE-10328.3.patch, 
> HIVE-10328.4.patch, HIVE-10328.4.patch, HIVE-10328.5.patch, 
> HIVE-10328.6.patch, HIVE-10328.7.patch, HIVE-10328.8.patch, 
> HIVE-10328.9.patch, HIVE-10328.patch
>
>






[jira] [Commented] (HIVE-14135) beeline output not formatted correctly for large column widths

2016-07-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375959#comment-15375959
 ] 

Sergio Peña commented on HIVE-14135:


The patch looks good. Just a couple of questions:

TestBufferedRows
- Is the setUp() method needed? Can we remove it?

BeeLineOpts
- What's the difference between DEFAULT_MAX_COLUMN_WIDTH and DEFAULT_MAX_WIDTH? 
Why do they have 80 vs 50?
- Is there a way to specify the max width from a command line parameter?

> beeline output not formatted correctly for large column widths
> --
>
> Key: HIVE-14135
> URL: https://issues.apache.org/jira/browse/HIVE-14135
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14135.1.patch, HIVE-14135.2.patch, 
> longKeyValues.txt, output_after.txt, output_before.txt
>
>
> If the column width is too large then beeline uses the maximum column width 
> when normalizing all the column widths. In order to reproduce the issue, run 
> set -v; 
> One of the configuration variables is classpath, which can be extremely wide 
> (41k characters in my environment).





[jira] [Assigned] (HIVE-10328) Enable new return path for cbo

2016-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-10328:
--

Assignee: Vineet Garg  (was: Ashutosh Chauhan)

> Enable new return path for cbo
> --
>
> Key: HIVE-10328
> URL: https://issues.apache.org/jira/browse/HIVE-10328
> Project: Hive
>  Issue Type: Task
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
> Attachments: HIVE-10328.1.patch, HIVE-10328.10.patch, 
> HIVE-10328.11.patch, HIVE-10328.12.patch, HIVE-10328.13.patch, 
> HIVE-10328.2.patch, HIVE-10328.3.patch, HIVE-10328.4.patch, 
> HIVE-10328.4.patch, HIVE-10328.5.patch, HIVE-10328.6.patch, 
> HIVE-10328.7.patch, HIVE-10328.8.patch, HIVE-10328.9.patch, HIVE-10328.patch
>
>






[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375942#comment-15375942
 ] 

Sergey Shelukhin commented on HIVE-14074:
-

+1

> RELOAD FUNCTION should update dropped functions
> ---
>
> Key: HIVE-14074
> URL: https://issues.apache.org/jira/browse/HIVE-14074
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Fix For: 2.2.0
>
> Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, 
> HIVE-14074.03.patch
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only 
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD 
> FUNCTION in the current session is a way to force a reload of the functions, 
> so that changes that occurred in other running sessions will be reflected in 
> the current session, without having to restart the current session. However, 
> while functions that are created in other sessions will now appear in the 
> current session, functions that have been dropped are not removed from the 
> current session's registry. It seems inconsistent that created functions are 
> updated while dropped functions are not.





[jira] [Assigned] (HIVE-14167) Use work directories provided by Tez instead of directly using YARN local dirs

2016-07-13 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-14167:


Assignee: Wei Zheng

> Use work directories provided by Tez instead of directly using YARN local dirs
> --
>
> Key: HIVE-14167
> URL: https://issues.apache.org/jira/browse/HIVE-14167
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Siddharth Seth
>Assignee: Wei Zheng
>
> HIVE-13303 fixed things to use multiple directories instead of a single tmp 
> directory. However it's using yarn-local-dirs directly.
> I'm not sure how well using the yarn-local-dir will work on a secure cluster.
> Would be better to use Tez*Context.getWorkDirs. This provides an app specific 
> directory - writable by the user.
> cc [~sershe]





[jira] [Updated] (HIVE-14231) timestamp support is limited to 4 digit year

2016-07-13 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-14231:
-
Reporter: Takahiko Saito  (was: Thejas M Nair)

> timestamp support is limited to 4 digit year
> 
>
> Key: HIVE-14231
> URL: https://issues.apache.org/jira/browse/HIVE-14231
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Takahiko Saito
>
> Hive doesn't handle timestamp values that have a year with more than 4 digits.
> This limitation seems to be primarily around string to timestamp conversion.
> {code}
> Following insert query would insert NULL record -
> create table ts_test (t timestamp);
> insert into ts_test values ('2015-01-01 1:1:1');
> insert into ts_test values ('20151-01-01 1:1:1');
> select CAST(t as String)  from ts_test;
> +----------------------+--+
> |          t           |
> +----------------------+--+
> | 2015-01-01 01:01:01  |
> | NULL                 |
> +----------------------+--+
> {code}
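
A minimal sketch of one way such a limit can arise. It assumes the string-to-timestamp conversion ends up in java.sql.Timestamp.valueOf, whose documented input format is yyyy-[m]m-[d]d hh:mm:ss[.f...]; this thread does not confirm which code path Hive uses. The class TimestampYearSketch and the helper parseOrNull are hypothetical names used only for illustration.

```java
import java.sql.Timestamp;

final class TimestampYearSketch {
    /** Returns the parsed timestamp, or null when the text is rejected. */
    static Timestamp parseOrNull(String text) {
        try {
            return Timestamp.valueOf(text);
        } catch (IllegalArgumentException e) {
            // Assumption: a caller that swallows this error would surface NULL.
            return null;
        }
    }

    public static void main(String[] args) {
        // Four-digit year: accepted, single-digit time fields included.
        System.out.println(parseOrNull("2015-01-01 1:1:1"));
        // Five-digit year: rejected by the yyyy-... format check.
        System.out.println(parseOrNull("20151-01-01 1:1:1"));
    }
}
```

If this is the relevant path, supporting longer years would likely need a different parser rather than a change to the query.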





[jira] [Updated] (HIVE-14137) Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty tables

2016-07-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-14137:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

> Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty 
> tables
> ---
>
> Key: HIVE-14137
> URL: https://issues.apache.org/jira/browse/HIVE-14137
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-14137.1.patch, HIVE-14137.2.patch, 
> HIVE-14137.3.patch, HIVE-14137.4.patch, HIVE-14137.5.patch, 
> HIVE-14137.6.patch, HIVE-14137.patch
>
>
> The following queries:
> {code}
> -- Setup
> drop table if exists empty1;
> create table empty1 (col1 bigint) stored as parquet tblproperties 
> ('parquet.compress'='snappy');
> drop table if exists empty2;
> create table empty2 (col1 bigint, col2 bigint) stored as parquet 
> tblproperties ('parquet.compress'='snappy');
> drop table if exists empty3;
> create table empty3 (col1 bigint) stored as parquet tblproperties 
> ('parquet.compress'='snappy');
> -- All empty HDFS directories.
> -- Fails with [08S01]: Error while processing statement: FAILED: Execution 
> Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> -- Two empty HDFS directories.
> -- Create an empty file in HDFS.
> insert into empty1 select * from empty1 where false;
> -- Same query fails with [08S01]: Error while processing statement: FAILED: 
> Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> -- One empty HDFS directory.
> -- Create an empty file in HDFS.
> insert into empty2 select * from empty2 where false;
> -- Same query succeeds.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> {code}
> Will result in the following exception:
> {code}
> org.apache.hadoop.fs.FileAlreadyExistsException: 
> /tmp/hive/hive/1f3837aa-9407-4780-92b1-42a66d205139/hive_2016-06-24_15-45-23_206_79177714958655528-2/-mr-10004/0/emptyFile
>  for client 172.26.14.151 already exists
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2784)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2561)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:593)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:111)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:393)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1902)
>   at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1738)
>   at 

[jira] [Updated] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11402:

Attachment: HIVE-11402.05.patch

Changed the test to use sleep UDF, and sleep for half a second

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.05.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC; you need to spawn another thread as the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.
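
The idea of serializing all query execution for a single session can be sketched as a per-session gate. This is only an illustration under stated assumptions, not the attached patch: SessionGate, runExclusively and maxObservedConcurrency are hypothetical names, and a fair Semaphore stands in for whatever locking the real change uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class SessionGate {
    // One permit per session: at most one operation runs at a time.
    private final Semaphore permit = new Semaphore(1, true);

    <T> T runExclusively(Callable<T> operation) throws Exception {
        permit.acquire();
        try {
            return operation.call();
        } finally {
            permit.release();
        }
    }

    /** Submits several "queries" from separate threads and reports the peak overlap. */
    static int maxObservedConcurrency(int clientThreads) {
        SessionGate gate = new SessionGate();
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // Stand-in for a query that takes a known amount of time.
        Callable<Integer> query = () -> {
            int now = running.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(50);
            running.decrementAndGet();
            return now;
        };
        Callable<Integer> guarded = () -> gate.runExclusively(query);
        ExecutorService pool = Executors.newFixedThreadPool(clientThreads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < clientThreads; i++) {
                pending.add(pool.submit(guarded));
            }
            for (Future<Integer> f : pending) {
                f.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent queries: " + maxObservedConcurrency(4));
    }
}
```

The fixed sleep plays the same role as a sleep UDF in a test: it makes the first operation last long enough that an unguarded second operation would visibly overlap with it.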





[jira] [Commented] (HIVE-14137) Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty tables

2016-07-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375933#comment-15375933
 ] 

Sergio Peña commented on HIVE-14137:


Patch looks good.
+1

> Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty 
> tables
> ---
>
> Key: HIVE-14137
> URL: https://issues.apache.org/jira/browse/HIVE-14137
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-14137.1.patch, HIVE-14137.2.patch, 
> HIVE-14137.3.patch, HIVE-14137.4.patch, HIVE-14137.5.patch, 
> HIVE-14137.6.patch, HIVE-14137.patch
>
>
> The following queries:
> {code}
> -- Setup
> drop table if exists empty1;
> create table empty1 (col1 bigint) stored as parquet tblproperties 
> ('parquet.compress'='snappy');
> drop table if exists empty2;
> create table empty2 (col1 bigint, col2 bigint) stored as parquet 
> tblproperties ('parquet.compress'='snappy');
> drop table if exists empty3;
> create table empty3 (col1 bigint) stored as parquet tblproperties 
> ('parquet.compress'='snappy');
> -- All empty HDFS directories.
> -- Fails with [08S01]: Error while processing statement: FAILED: Execution 
> Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> -- Two empty HDFS directories.
> -- Create an empty file in HDFS.
> insert into empty1 select * from empty1 where false;
> -- Same query fails with [08S01]: Error while processing statement: FAILED: 
> Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> -- One empty HDFS directory.
> -- Create an empty file in HDFS.
> insert into empty2 select * from empty2 where false;
> -- Same query succeeds.
> select empty1.col1
> from empty1
> inner join empty2
> on empty2.col1 = empty1.col1
> inner join empty3
> on empty3.col1 = empty2.col2;
> {code}
> Will result in the following exception:
> {code}
> org.apache.hadoop.fs.FileAlreadyExistsException: 
> /tmp/hive/hive/1f3837aa-9407-4780-92b1-42a66d205139/hive_2016-06-24_15-45-23_206_79177714958655528-2/-mr-10004/0/emptyFile
>  for client 172.26.14.151 already exists
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2784)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2561)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:593)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:111)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:393)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1902)
>   at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1738)
>   at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1663)
>   at 
> 

[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release

2016-07-13 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375930#comment-15375930
 ] 

Shannon Ladymon commented on HIVE-14007:


Ah, that makes sense.  Thanks!

> Replace ORC module with ORC release
> ---
>
> Key: HIVE-14007
> URL: https://issues.apache.org/jira/browse/HIVE-14007
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch
>
>
> This completes moving the core ORC reader & writer to the ORC project.





[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions

2016-07-13 Thread Abdullah Yousufi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375929#comment-15375929
 ] 

Abdullah Yousufi commented on HIVE-14074:
-

Yeah the regex will return all permanent UDFs, as the format for those 
functions is ".". This will exclude built-in and 
temporary functions, which should not be affected by the reload.
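
A rough sketch of the filtering described above, for readers following along. It is not the patch itself: the pattern, the class ReloadFunctionsSketch and the method reload are hypothetical, and it assumes permanent functions are the names containing a dot while built-in and temporary names have none.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

final class ReloadFunctionsSketch {
    // Hypothetical pattern: any name with a dot in the middle is treated as permanent.
    private static final Pattern QUALIFIED = Pattern.compile(".+\\..+");

    /**
     * Returns the session registry after a reload: permanent functions that are
     * no longer in the metastore are dropped, new metastore functions are added,
     * and built-in or temporary functions are left untouched.
     */
    static Set<String> reload(Set<String> sessionRegistry, Set<String> metastoreFunctions) {
        Set<String> result = new TreeSet<>();
        for (String name : sessionRegistry) {
            boolean permanent = QUALIFIED.matcher(name).matches();
            if (!permanent || metastoreFunctions.contains(name)) {
                result.add(name);
            }
        }
        result.addAll(metastoreFunctions);
        return result;
    }

    public static void main(String[] args) {
        Set<String> registry = new TreeSet<>(Set.of("upper", "tmp_fn", "db1.f_old", "db1.f_keep"));
        Set<String> metastore = new TreeSet<>(Set.of("db1.f_keep", "db1.f_new"));
        System.out.println(reload(registry, metastore));
    }
}
```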

> RELOAD FUNCTION should update dropped functions
> ---
>
> Key: HIVE-14074
> URL: https://issues.apache.org/jira/browse/HIVE-14074
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Fix For: 2.2.0
>
> Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, 
> HIVE-14074.03.patch
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only 
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD 
> FUNCTION in the current session is a way to force a reload of the functions, 
> so that changes that occurred in other running sessions will be reflected in 
> the current session, without having to restart the current session. However, 
> while functions that are created in other sessions will now appear in the 
> current session, functions that have been dropped are not removed from the 
> current session's registry. It seems inconsistent that created functions are 
> updated while dropped functions are not.





[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375912#comment-15375912
 ] 

Sergey Shelukhin commented on HIVE-14074:
-

Looks mostly good, one question - why is it using the regex with a dot in the 
middle? Is there anything special about filtering functions like that - doesn't 
the method already return FQ names?

> RELOAD FUNCTION should update dropped functions
> ---
>
> Key: HIVE-14074
> URL: https://issues.apache.org/jira/browse/HIVE-14074
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Fix For: 2.2.0
>
> Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, 
> HIVE-14074.03.patch
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only 
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD 
> FUNCTION in the current session is a way to force a reload of the functions, 
> so that changes that occurred in other running sessions will be reflected in 
> the current session, without having to restart the current session. However, 
> while functions that are created in other sessions will now appear in the 
> current session, functions that have been dropped are not removed from the 
> current session's registry. It seems inconsistent that created functions are 
> updated while dropped functions are not.





[jira] [Commented] (HIVE-14230) Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375903#comment-15375903
 ] 

Sergey Shelukhin commented on HIVE-14230:
-

+1

> Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI
> --
>
> Key: HIVE-14230
> URL: https://issues.apache.org/jira/browse/HIVE-14230
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-14230.1.patch
>
>
> Hadoop23Shims.cloneUgi() creates a Subject using the default constructor, 
> leaving the newly created subject with empty credentials.





[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380

2016-07-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375902#comment-15375902
 ] 

Xuefu Zhang commented on HIVE-14077:


RE. Also I thought Hive does strive to be ANSI compliant

Yeah, but there are limits on what we can do, especially when backward compatibility (b/c) is a concern.

> revert or fix HIVE-13380
> 
>
> Key: HIVE-14077
> URL: https://issues.apache.org/jira/browse/HIVE-14077
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> See comments in that JIRA





[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions

2016-07-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375886#comment-15375886
 ] 

Sergio Peña commented on HIVE-14074:


The patch looks good based on [~sershe]'s comments. [~sershe], could you give a 
quick review of the patch?

> RELOAD FUNCTION should update dropped functions
> ---
>
> Key: HIVE-14074
> URL: https://issues.apache.org/jira/browse/HIVE-14074
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Fix For: 2.2.0
>
> Attachments: HIVE-14074.01.patch, HIVE-14074.02.patch, 
> HIVE-14074.03.patch
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only 
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD 
> FUNCTION in the current session is a way to force a reload of the functions, 
> so that changes that occurred in other running sessions will be reflected in 
> the current session, without having to restart the current session. However, 
> while functions that are created in other sessions will now appear in the 
> current session, functions that have been dropped are not removed from the 
> current session's registry. It seems inconsistent that created functions are 
> updated while dropped functions are not.





[jira] [Updated] (HIVE-14230) Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI

2016-07-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14230:
--
Attachment: HIVE-14230.1.patch

Attaching patch - use the other Subject constructor which allows credentials to 
be passed in.
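
For reference, the difference between the two javax.security.auth.Subject constructors can be sketched as follows. This is an illustration only, not the contents of the patch; SubjectCloneSketch and cloneWithCredentials are hypothetical names, and the string credential is a placeholder for a real token.

```java
import java.security.Principal;
import java.util.HashSet;
import javax.security.auth.Subject;

final class SubjectCloneSketch {
    /** Copies principals and both credential sets into a new, modifiable Subject. */
    static Subject cloneWithCredentials(Subject original) {
        return new Subject(
            false, // not read-only
            new HashSet<Principal>(original.getPrincipals()),
            new HashSet<Object>(original.getPublicCredentials()),
            new HashSet<Object>(original.getPrivateCredentials()));
    }

    public static void main(String[] args) {
        Subject original = new Subject();
        original.getPrivateCredentials().add("delegation-token-placeholder");

        Subject emptyCopy = new Subject();                  // default constructor: nothing carried over
        Subject fullCopy = cloneWithCredentials(original);  // credentials carried over

        System.out.println("default constructor: " + emptyCopy.getPrivateCredentials().size());
        System.out.println("four-arg constructor: " + fullCopy.getPrivateCredentials().size());
    }
}
```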

> Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI
> --
>
> Key: HIVE-14230
> URL: https://issues.apache.org/jira/browse/HIVE-14230
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-14230.1.patch
>
>
> Hadoop23Shims.cloneUgi() creates a Subject using the default constructor, 
> leaving the newly created subject with empty credentials.





[jira] [Updated] (HIVE-14230) Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI

2016-07-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14230:
--
Status: Patch Available  (was: Open)

> Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI
> --
>
> Key: HIVE-14230
> URL: https://issues.apache.org/jira/browse/HIVE-14230
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-14230.1.patch
>
>
> Hadoop23Shims.cloneUgi() creates a Subject using the default constructor, 
> leaving the newly created subject with empty credentials.





[jira] [Updated] (HIVE-14122) VectorMapOperator: Missing update to AbstractMapOperator::numRows

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14122:
-
Fix Version/s: 2.1.1

> VectorMapOperator: Missing update to AbstractMapOperator::numRows
> -
>
> Key: HIVE-14122
> URL: https://issues.apache.org/jira/browse/HIVE-14122
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14122.1.patch
>
>
> The INPUT_RECORDS counter is out of sync with the actual # of rows-read in 
> vectorized and non-vectorized modes.
> This means Tez record summaries are off by a large margin or are 0 for those 
> vertices.





[jira] [Commented] (HIVE-14122) VectorMapOperator: Missing update to AbstractMapOperator::numRows

2016-07-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375854#comment-15375854
 ] 

Prasanth Jayachandran commented on HIVE-14122:
--

Committed to branch-2.1 as well.

> VectorMapOperator: Missing update to AbstractMapOperator::numRows
> -
>
> Key: HIVE-14122
> URL: https://issues.apache.org/jira/browse/HIVE-14122
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14122.1.patch
>
>
> The INPUT_RECORDS counter is out of sync with the actual # of rows-read in 
> vectorized and non-vectorized modes.
> This means Tez record summaries are off by a large margin or are 0 for those 
> vertices.





[jira] [Resolved] (HIVE-14161) from_utc_timestamp()/to_utc_timestamp return incorrect results with EST

2016-07-13 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang resolved HIVE-14161.

Resolution: Not A Problem

I do not think it is a bug since different IDs for a time zone can have 
different DST behavior.
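
A small sketch of the underlying behavior, assuming the UDFs resolve the zone string through java.util.TimeZone (not confirmed in this thread). In the JDK, the short ID "EST" is a fixed UTC-5 zone with no daylight saving, while "PST" and "CST" follow daylight saving rules; a region ID such as "America/New_York" does too. ZoneIdOffsets and offsetHours are hypothetical names.

```java
import java.time.Instant;
import java.util.TimeZone;

final class ZoneIdOffsets {
    /** UTC offset in whole hours for the given zone ID at the given instant. */
    static int offsetHours(String zoneId, Instant when) {
        TimeZone tz = TimeZone.getTimeZone(zoneId);
        return tz.getOffset(when.toEpochMilli()) / 3_600_000;
    }

    public static void main(String[] args) {
        // A summer instant, matching the date used in the issue description.
        Instant summer = Instant.parse("2016-06-30T13:00:00Z");
        for (String id : new String[] {"PST", "CST", "EST", "America/New_York"}) {
            TimeZone tz = TimeZone.getTimeZone(id);
            System.out.println(id + ": offset " + offsetHours(id, summer)
                + "h, observes DST: " + tz.useDaylightTime());
        }
    }
}
```

Under that assumption, passing a region ID instead of "EST" would give the 4-hour summer offset the reporter expected.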

> from_utc_timestamp()/to_utc_timestamp return incorrect results with EST
> ---
>
> Key: HIVE-14161
> URL: https://issues.apache.org/jira/browse/HIVE-14161
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> {code}
> hive> SELECT to_utc_timestamp('2016-06-30 06:00:00', 'PST'); 
> OK
> 2016-06-30 13:00:00  ==>Correct, UTC is 7 hours ahead of PST
> Time taken: 1.674 seconds, Fetched: 1 row(s)
> hive> SELECT to_utc_timestamp('2016-06-30 08:00:00', 'CST');
> OK
> 2016-06-30 13:00:00  ==>Correct, UTC is 5 hours ahead of CST
> Time taken: 1.776 seconds, Fetched: 1 row(s)
> hive> SELECT to_utc_timestamp('2016-06-30 09:00:00', 'EST');
> OK
> 2016-06-30 14:00:00  ==>Wrong, UTC should be 4 hours ahead of EST
> Time taken: 1.686 seconds, Fetched: 1 row(s)
> hive> select from_utc_timestamp('2016-06-30 14:00:00', 'EST');
> OK
> 2016-06-30 09:00:00  ==>Wrong, UTC should be 4 hours ahead of EST
> {code}
> It might be something related to daylight savings time.





[jira] [Commented] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375829#comment-15375829
 ] 

Thejas M Nair commented on HIVE-11402:
--

Test case looks good (though I would have preferred the use of a sleep UDF to 
guarantee how long the first query would take).
Just some minor comments about comments in reviewboard.
+1 with the review comments addressed.


> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC; you need to spawn another thread, as the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.
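
The idea of serializing all query execution within one session can be sketched as a per-session lock. This is a minimal illustration, not the HIVE-11402 patch: the class SerializedSession and its runQuery method are hypothetical names, and the real change may reject concurrent calls rather than queue them.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical per-session gate; not HiveServer2's actual implementation.
class SerializedSession {
    // Fair lock, so waiting queries acquire it roughly in arrival order.
    private final ReentrantLock queryLock = new ReentrantLock(true);

    /** Runs the query while holding the session lock, so two queries never overlap. */
    <T> T runQuery(Supplier<T> query) {
        queryLock.lock();
        try {
            return query.get();
        } finally {
            // Always release, even if the query throws.
            queryLock.unlock();
        }
    }
}
```

With one such object per session, threads sharing a session (for example several Hue requests) would queue behind each other, while different sessions remain independent.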





[jira] [Comment Edited] (HIVE-13369) AcidUtils.getAcidState() is not paying attention toValidTxnList when choosing the "best" base file

2016-07-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375820#comment-15375820
 ] 

Eugene Koifman edited comment on HIVE-13369 at 7/13/16 9:46 PM:


most failed tests have age > 2
list_bucket_dml_12 fails on and off (e.g. 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/499/testReport/)

todo: check auto_sortmerge_join_2



was (Author: ekoifman):
most failed tests have age > 2
list_bucket_dml_12 fails on and off (e.g. 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/499/testReport/)


> AcidUtils.getAcidState() is not paying attention toValidTxnList when choosing 
> the "best" base file
> --
>
> Key: HIVE-13369
> URL: https://issues.apache.org/jira/browse/HIVE-13369
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-13369.1.patch, HIVE-13369.2.patch, 
> HIVE-13369.3.patch
>
>
> The JavaDoc on getAcidState() reads, in part:
> "Note that because major compactions don't
>preserve the history, we can't use a base directory that includes a
>transaction id that we must exclude."
> which is correct but there is nothing in the code that does this.
> And if we detect a situation where txn X must be excluded but there are 
> deltas that contain X, we'll have to abort the txn.  This can't (reasonably) 
> happen with auto commit mode, but with multi statement txns it's possible.
> Suppose some long running txn starts and locks in a snapshot at 17 (HWM).  An 
> hour later it decides to access some partition for which all txns < 20 (for 
> example) have already been compacted (i.e. GC'd).
> ==
> Here is a more concrete example.  Let's say the files for table A are as 
> follows and were created in the order listed.
> delta_4_4
> delta_5_5
> delta_4_5
> base_5
> delta_16_16
> delta_17_17
> base_17  (for example user ran major compaction)
> Let's say getAcidState() is called with ValidTxnList(20:16), i.e. with HWM=20 
> and ExceptionList=<16>
> Assume that all txns <= 20 commit.
> Reader can't use base_17 because it has the result of txn16.  So it should 
> choose base_5 as "TxnBase bestBase" in _getChildState()_.
> Then the rest of the logic in _getAcidState()_ should choose delta_16_16 and 
> delta_17_17 in the _Directory_ object.  This would represent an acceptable 
> snapshot for such a reader.
> The issue is if at the same time the Cleaner process is running.  It will see 
> everything with txnid<17 as obsolete.  Then it will check lock manager state 
> and decide to delete (as there may not be any locks in LM for table A).  The 
> order in which the files are deleted is undefined right now.  It may delete 
> delta_16_16 and delta_17_17 first and right at this moment the read request 
> with ValidTxnList(20:16) arrives (such a snapshot may have been locked in by 
> some multi-stmt txn that started some time ago).  It acquires locks after the 
> Cleaner checks LM state and calls getAcidState(). This request will choose 
> base_5 but it won't see delta_16_16 and delta_17_17 and thus return the 
> snapshot w/o modifications made by those txns.
> [This is not possible currently since we only support autoCommit=true.  The 
> reason is that a query (0) opens a txn (if appropriate), (1) acquires locks, 
> (2) locks in the snapshot.  The cleaner won't delete anything for a given 
> compaction (partition) if there are locks on it.  Thus for the duration of the 
> transaction, nothing will be deleted so it's safe to use base_5]
> This is a subtle race condition but possible.
> 1. So the safest thing to do to ensure correctness is to use the latest 
> base_x as the "best" and check against exceptions in ValidTxnList and throw 
> an exception if there is an exception <=x.
> 2. A better option is to keep 2 exception lists: aborted and open and only 
> throw if there is an open txn <=x.  Compaction throws away data from aborted 
> txns and thus there is no harm using base with aborted txns in its range.
> 3. You could make each txn record the lowest open txn id at its start and 
> prevent the cleaner from cleaning any delta with an id range that includes 
> this open txn id for any txn that is still running.  This has the drawback of 
> potentially delaying GC of old files for arbitrarily long periods.  So this 
> should be a user config choice.   The implementation is not trivial.
> I would go with 1 now and do 2/3 together with multi-statement txn work.
> Side note:  if two deltas have overlapping ID ranges, then one must be a 
> subset of the other.
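
Option 1 from the description above (take the latest base_x and fail if any excluded txn id is <= x) can be sketched as follows. This is a simplified illustration under stated assumptions, not the AcidUtils code: BestBaseSketch and chooseBaseStrict are hypothetical names, bases are represented only by their txn ids, and the exception list is a plain collection rather than a ValidTxnList.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical sketch of option 1; it does not use Hive's real classes.
class BestBaseSketch {
    /**
     * Picks the newest base_x and throws if any excluded txn id is <= x,
     * because a major compaction has folded that txn's data into the base.
     * Throws NoSuchElementException if baseTxnIds is empty.
     */
    static long chooseBaseStrict(Collection<Long> baseTxnIds, Collection<Long> exceptions) {
        long best = Collections.max(baseTxnIds);
        for (long excluded : exceptions) {
            if (excluded <= best) {
                throw new IllegalStateException(
                        "base_" + best + " includes excluded txn " + excluded);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Bases from the example: base_5 and base_17, with txn 16 excluded.
        try {
            chooseBaseStrict(Arrays.asList(5L, 17L), Collections.singletonList(16L));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Option 2 would call the same check with only the open txn ids, leaving aborted ones out of the exception collection.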





[jira] [Commented] (HIVE-13369) AcidUtils.getAcidState() is not paying attention toValidTxnList when choosing the "best" base file

2016-07-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375820#comment-15375820
 ] 

Eugene Koifman commented on HIVE-13369:
---

most failed tests have age > 2
list_bucket_dml_12 fails on and off (e.g. 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/499/testReport/)




[jira] [Commented] (HIVE-13369) AcidUtils.getAcidState() is not paying attention toValidTxnList when choosing the "best" base file

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375794#comment-15375794
 ] 

Hive QA commented on HIVE-13369:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817558/HIVE-13369.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10316 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/501/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/501/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-501/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12817558 - PreCommit-HIVE-MASTER-Build


[jira] [Updated] (HIVE-14172) LLAP: force evict blocks by size to handle memory fragmentation

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14172:

       Resolution: Fixed
    Fix Version/s: 2.1.1
                   2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master and 2.1. Thanks for the review!

> LLAP: force evict blocks by size to handle memory fragmentation
> ---
>
> Key: HIVE-14172
> URL: https://issues.apache.org/jira/browse/HIVE-14172
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14172.01.patch, HIVE-14172.patch
>
>
> In the long run, we should replace the buddy allocator with a better scheme. 
> For now, add a workaround for fragmentation that cannot be easily resolved. 
> It's still not perfect but works for practical ORC cases, where we have the 
> default size and smaller blocks, rather than large allocations having trouble.





[jira] [Commented] (HIVE-13965) Empty resultset run into Exception when using Thrift Binary Serde

2016-07-13 Thread Ziyang Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375770#comment-15375770
 ] 

Ziyang Zhao commented on HIVE-13965:


Thank you [~vgumashta]!

> Empty resultset run into Exception when using Thrift Binary Serde
> -
>
> Key: HIVE-13965
> URL: https://issues.apache.org/jira/browse/HIVE-13965
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13965.1.patch.txt
>
>
> This error can be reproduced by enabling the Thrift binary SerDe, connecting 
> to HiveServer2 with beeline, and executing the following commands:
> >create table test3(num1 int);
> >create table test4(num1 int);
> >insert into test3 values(1);
> >insert into test4 values(2);
> >select * from test3 join test4 on test3.num1=test4.num1;
> The result should be empty, but it gives an exception:
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:206)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1029)
>     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:641)
>     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:655)
>     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:655)
>     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:655)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:195)
>     ... 8 more
> This error is caused in FileSinkOperator.java.
> If the result set is empty, process() is never called, so the variable 
> "fpaths" is never set. closeOp() then runs:
> if (conf.isHiveServerQuery()
>     && HiveConf.getBoolVar(hconf,
>         HiveConf.ConfVars.HIVE_SERVER2_THRIFT_RESULTSET_SERIALIZE_IN_TASKS)
>     && serializer.getClass().getName().equalsIgnoreCase(
>         ThriftJDBCBinarySerDe.class.getName())) {
>   try {
>     recordValue = serializer.serialize(null, inputObjInspectors[0]);
>     rowOutWriters = fpaths.outWriters;
>     rowOutWriters[0].write(recordValue);
>   } catch (SerDeException | IOException e) {
>     throw new HiveException(e);
>   }
> }
> Here fpaths is null.
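
The failure mode described above can be reproduced in isolation with stand-in types. The sketch below is not the committed fix for HIVE-13965, which this thread does not show. It only contrasts an unconditional dereference with a null guard, and every name in it (EmptyResultCloseSketch, PathState, closeUnguarded, closeGuarded) is hypothetical.

```java
// Simplified stand-ins for the classes named in the report; all names are hypothetical.
class EmptyResultCloseSketch {
    /** Stand-in for the per-path writer state that process() would normally create. */
    static final class PathState {
        final StringBuilder[] outWriters = { new StringBuilder() };
    }

    /** Mirrors the reported code path: dereferences fpaths unconditionally. */
    static void closeUnguarded(PathState fpaths, String recordValue) {
        // Throws NullPointerException when no row was processed (fpaths == null).
        fpaths.outWriters[0].append(recordValue);
    }

    /** Guarded variant: returns false instead of failing when no row was processed. */
    static boolean closeGuarded(PathState fpaths, String recordValue) {
        if (fpaths == null) {
            return false; // process() never ran, so there is no writer to write to
        }
        fpaths.outWriters[0].append(recordValue);
        return true;
    }
}
```

An alternative design would be to create the writers lazily inside closeOp() so that an empty result set still produces a well-formed (empty) serialized batch. Which approach the patch takes is not stated here.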





[jira] [Updated] (HIVE-13965) Empty resultset run into Exception when using Thrift Binary Serde

2016-07-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13965:

       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.1.1
                   2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master and branch-2.1. Thanks [~ziyangz] for the patch!



[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-07-13 Thread Ziyang Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-13723:
---
Attachment: (was: HIVE-13723.3.patch)

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Attachments: HIVE-13723.4.patch.txt
>
>
> After enabling the Thrift SerDe, execute the following queries in beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> This gives the following error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
>  
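
The innermost cause in the trace above, "java.lang.Float cannot be cast to java.lang.Double", is ordinary Java behaviour for boxed values: a boxed Float is not a Double, so a direct cast fails at runtime even though the primitive types widen implicitly. The sketch below illustrates only that language rule. It is not the SerDe code or the HIVE-13723 patch, and the class and method names are made up for the example.

```java
// Illustrates the boxed-type cast failure; names are hypothetical, not Hive code.
class FloatWideningSketch {
    /** Direct cast of a boxed value: throws ClassCastException unless it is a Double. */
    static double castDirectly(Object value) {
        return (Double) value;
    }

    /** Widening through Number works for Float, Double, Integer and other numeric boxes. */
    static double widen(Object value) {
        return ((Number) value).doubleValue();
    }

    public static void main(String[] args) {
        Object joinKey = Float.valueOf(1.0f); // e.g. the value of column b in the report
        System.out.println(widen(joinKey));   // 1.0
        try {
            castDirectly(joinKey);
        } catch (ClassCastException e) {
            System.out.println("direct cast failed");
        }
    }
}
```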

[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-07-13 Thread Ziyang Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-13723:
---
Attachment: (was: HIVE-13723.2.patch.txt)


[jira] [Commented] (HIVE-14191) bump a new api version for ThriftJDBCBinarySerde changes

2016-07-13 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375760#comment-15375760
 ] 

Vaibhav Gumashta commented on HIVE-14191:
-

+1 pending test.

> bump a new api version for ThriftJDBCBinarySerde changes
> 
>
> Key: HIVE-14191
> URL: https://issues.apache.org/jira/browse/HIVE-14191
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Attachments: HIVE-14191.1.patch, HIVE-14191.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-07-13 Thread Ziyang Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-13723:
---
Attachment: HIVE-13723.4.patch.txt

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Attachments: HIVE-13723.2.patch.txt, HIVE-13723.3.patch, 
> HIVE-13723.4.patch.txt
>
>
> After enabling the Thrift Serde, execute the following queries in beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> This produces the following error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> 
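The innermost cause in the trace above is a plain Java cast failure, so it can be reproduced without Hive. The sketch below is only an illustration: the class and method names are hypothetical, and it does not claim to be how the attached patch resolves the issue. It shows that a boxed Float cannot be cast to Double, while widening through Number works for either type.

```java
/** Minimal reproduction of the cast failure, independent of Hive (hypothetical names). */
final class FloatToDoubleCast {
    private FloatToDoubleCast() {}

    /** Mirrors the failing pattern: a boxed Float is not a Double. */
    static double unsafeCast(Object value) {
        return (Double) value; // throws ClassCastException when value is a Float
    }

    /** Widens any boxed numeric value through Number instead of casting. */
    static double safeWiden(Object value) {
        return ((Number) value).doubleValue();
    }
}
```

Calling unsafeCast with Float.valueOf(1.0f) raises the same ClassCastException as in the report, whereas safeWiden returns the value as a double.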

[jira] [Updated] (HIVE-14191) bump a new api version for ThriftJDBCBinarySerde changes

2016-07-13 Thread Ziyang Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao updated HIVE-14191:
---
Status: Patch Available  (was: Open)

> bump a new api version for ThriftJDBCBinarySerde changes
> 
>
> Key: HIVE-14191
> URL: https://issues.apache.org/jira/browse/HIVE-14191
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
> Attachments: HIVE-14191.1.patch, HIVE-14191.2.patch
>
>






[jira] [Resolved] (HIVE-14164) JDBC: Add retry in JDBC driver when reading config values from ZK

2016-07-13 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta resolved HIVE-14164.
-
Resolution: Duplicate

> JDBC: Add retry in JDBC driver when reading config values from ZK
> -
>
> Key: HIVE-14164
> URL: https://issues.apache.org/jira/browse/HIVE-14164
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0, 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> Sometimes ZK may intermittently experience network partitioning. During this 
> time, clients trying to open a JDBC connection get an exception. To improve 
> user experience, we should implement retry logic and fail only after the 
> retries are exhausted.
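A bounded retry of this kind could be sketched as follows. This is an illustration under assumptions, not the Hive JDBC driver's code: RetryingReader, readWithRetry and the linear backoff are all hypothetical, and the ZooKeeper read is represented by an arbitrary Callable.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper; not part of the Hive JDBC driver. */
final class RetryingReader {
    private RetryingReader() {}

    /**
     * Runs the read up to maxAttempts times, sleeping between failed attempts,
     * and rethrows the last failure once the attempts are exhausted.
     */
    static <T> T readWithRetry(Callable<T> read, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff: wait a little longer after each failure.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

The caller still sees an exception if every attempt fails, so a long outage is reported rather than hidden.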





[jira] [Commented] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375745#comment-15375745
 ] 

Sergey Shelukhin commented on HIVE-11402:
-

Also rebased the patch.

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread-safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC: you need to spawn another thread because the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.
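Serializing query execution per session, with an option to turn it off, might look roughly like the sketch below. It is a toy model: SerializedSession and its allowParallel flag are hypothetical stand-ins and not the HiveServer2 classes or the configuration option introduced by the patch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an HS2 session; not Hive's actual class. */
final class SerializedSession {
    // A fair lock so that queued queries run in arrival order.
    private final ReentrantLock queryLock = new ReentrantLock(true);
    private final boolean allowParallel;

    SerializedSession(boolean allowParallel) {
        this.allowParallel = allowParallel;
    }

    /** Runs the query; when parallelism is disallowed, one query runs at a time. */
    <T> T execute(Callable<T> query) throws Exception {
        if (allowParallel) {
            return query.call();
        }
        queryLock.lock();
        try {
            return query.call();
        } finally {
            queryLock.unlock();
        }
    }
}
```

Because the lock lives on the session object, queries in different sessions are unaffected; only threads sharing one session wait for each other.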





[jira] [Updated] (HIVE-11402) HS2 - add an option to disallow parallel query execution within a single Session

2016-07-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11402:

Attachment: HIVE-11402.04.patch

Added the test as suggested. I am not familiar with the setup of this test so I 
wouldn't vouch for its relevance ;)

> HS2 - add an option to disallow parallel query execution within a single 
> Session
> 
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.03.patch, HIVE-11402.04.patch, HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, i.e. it is not thread-safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for a single session is not 
> straightforward with JDBC: you need to spawn another thread because the 
> Statement.execute calls are blocking. I believe ODBC has a non-blocking query 
> execution API, and Hue is another well-known application that shares sessions 
> for all queries that a user runs.





[jira] [Updated] (HIVE-14196) Disable LLAP IO when complex types are involved

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14196:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks [~sershe] for the review!

> Disable LLAP IO when complex types are involved
> ---
>
> Key: HIVE-14196
> URL: https://issues.apache.org/jira/browse/HIVE-14196
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14196.1.patch, HIVE-14196.2.patch, 
> HIVE-14196.3.patch, HIVE-14196.4.patch
>
>
> Let's exclude the vector_complex_* tests added for LLAP, which are currently 
> broken and fail in all test runs. We can re-enable them with the HIVE-14089 
> patch.





[jira] [Updated] (HIVE-14196) Disable LLAP IO when complex types are involved

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14196:
-
Attachment: HIVE-14196.4.patch

Rebased after some trunk commit to LlapIF.

> Disable LLAP IO when complex types are involved
> ---
>
> Key: HIVE-14196
> URL: https://issues.apache.org/jira/browse/HIVE-14196
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14196.1.patch, HIVE-14196.2.patch, 
> HIVE-14196.3.patch, HIVE-14196.4.patch
>
>
> Let's exclude the vector_complex_* tests added for LLAP, which are currently 
> broken and fail in all test runs. We can re-enable them with the HIVE-14089 
> patch.





[jira] [Updated] (HIVE-14213) Add timeouts for various components in llap status check

2016-07-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14213:
--
   Resolution: Fixed
Fix Version/s: 2.1.1
   Status: Resolved  (was: Patch Available)

Thanks for the review. Committed to master and branch-2.1

> Add timeouts for various components in llap status check
> 
>
> Key: HIVE-14213
> URL: https://issues.apache.org/jira/browse/HIVE-14213
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.1.1
>
> Attachments: HIVE-14213.01.patch, HIVE-14213.02.patch
>
>
> The llapstatus check connects to various components - YARN, HDFS via Slider, 
> ZooKeeper. If any of these components is down, the command can take a 
> long time to exit.
> NO PRECOMMIT TESTS
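One generic way to keep a status check from hanging on an unresponsive component is to bound each call with a timeout. The sketch below only illustrates that idea with java.util.concurrent; BoundedCheck and callWithTimeout are hypothetical names, and the actual patch may instead rely on client-side timeout and retry configs, as the config-name discussion on this issue suggests.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper that bounds a blocking component check with a timeout. */
final class BoundedCheck {
    private BoundedCheck() {}

    static <T> T callWithTimeout(Callable<T> check, long timeout, TimeUnit unit)
            throws Exception {
        // Daemon thread so a stuck check cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "status-check");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(check);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the check and fail fast
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A check that answers in time returns its value; one that does not raises TimeoutException after the given bound instead of following a long retry policy.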





[jira] [Commented] (HIVE-14196) Disable LLAP IO when complex types are involved

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375656#comment-15375656
 ] 

Sergey Shelukhin commented on HIVE-14196:
-

+1

> Disable LLAP IO when complex types are involved
> ---
>
> Key: HIVE-14196
> URL: https://issues.apache.org/jira/browse/HIVE-14196
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14196.1.patch, HIVE-14196.2.patch, 
> HIVE-14196.3.patch
>
>
> Let's exclude the vector_complex_* tests added for LLAP, which are currently 
> broken and fail in all test runs. We can re-enable them with the HIVE-14089 
> patch.





[jira] [Updated] (HIVE-14222) PTF: Operator initialization does not clean state

2016-07-13 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14222:
-
Status: Patch Available  (was: Open)

> PTF: Operator initialization does not clean state
> -
>
> Key: HIVE-14222
> URL: https://issues.apache.org/jira/browse/HIVE-14222
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 1.2.1, 2.2.0
>Reporter: Gopal V
>Assignee: Wei Zheng
> Attachments: HIVE-14222.1.patch
>
>
> PTFOperator::initializeOp() does not reset currentKeys to null.
> {code}
>   if (currentKeys != null && !keysAreEqual) {
>     ptfInvocation.finishPartition();
>   }
>
>   if (currentKeys == null) {
>     currentKeys = newKeys.copyKey();
>   } else {
>     currentKeys.copyKey(newKeys);
>   }
> {code}
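The effect of not resetting the key in initializeOp() can be shown with a toy model. Everything in this sketch is hypothetical (PartitionTracker, the String key, the resetOnInit switch); it only mimics the shape of the snippet above and is not the PTFOperator code or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Toy model of an operator that is re-initialized and reused (hypothetical names). */
final class PartitionTracker {
    private String currentKey;                 // plays the role of currentKeys
    final List<String> finished = new ArrayList<>();
    private final boolean resetOnInit;

    PartitionTracker(boolean resetOnInit) {
        this.resetOnInit = resetOnInit;
    }

    /** Called before each reuse; the suggested behaviour clears the remembered key. */
    void initializeOp() {
        if (resetOnInit) {
            currentKey = null;
        }
    }

    void process(String newKey) {
        boolean keysAreEqual = Objects.equals(currentKey, newKey);
        if (currentKey != null && !keysAreEqual) {
            finished.add(currentKey);          // stands in for finishPartition()
        }
        currentKey = newKey;
    }
}
```

Without the reset, the first row after re-initialization is compared against a key left over from the previous use, so a partition from the earlier run is finished again. With the reset, the operator starts clean.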





[jira] [Commented] (HIVE-14213) Add timeouts for various components in llap status check

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375651#comment-15375651
 ] 

Sergey Shelukhin commented on HIVE-14213:
-

+1

> Add timeouts for various components in llap status check
> 
>
> Key: HIVE-14213
> URL: https://issues.apache.org/jira/browse/HIVE-14213
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14213.01.patch, HIVE-14213.02.patch
>
>
> The llapstatus check connects to various components - YARN, HDFS via Slider, 
> ZooKeeper. If any of these components is down, the command can take a 
> long time to exit.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14213) Add timeouts for various components in llap status check

2016-07-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14213:
--
Attachment: HIVE-14213.02.patch

Updated patch with the config names changed to use "llapcli" instead of 
"llapstatus".

bq. Can you still only set configs if not already set? At least when replacing 
with defaults.
The intent is to avoid the cluster defaults and set up values for llapstatus so 
that it fails fast, rather than retrying per the cluster default retry policy.

> Add timeouts for various components in llap status check
> 
>
> Key: HIVE-14213
> URL: https://issues.apache.org/jira/browse/HIVE-14213
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14213.01.patch, HIVE-14213.02.patch
>
>
> The llapstatus check connects to various components - YARN, HDFS via Slider, 
> ZooKeeper. If any of these components is down, the command can take a 
> long time to exit.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14198) Refactor aux jar related code to make them more consistent

2016-07-13 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14198:

Status: Patch Available  (was: In Progress)

> Refactor aux jar related code to make them more consistent
> --
>
> Key: HIVE-14198
> URL: https://issues.apache.org/jira/browse/HIVE-14198
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14198.1.patch
>
>
> There is some redundancy and inconsistency between hive.aux.jar.paths and 
> hive.reloadable.aux.jar.paths, and also between MR and Spark. 
> Refactor the code so that they share the same implementation.





[jira] [Updated] (HIVE-14213) Add timeouts for various components in llap status check

2016-07-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14213:
--
Description: 
The llapstatus check connects to various components - YARN, HDFS via Slider, 
ZooKeeper. If any of these components is down, the command can take a long 
time to exit.

NO PRECOMMIT TESTS

  was:The llapstatus check connects to various components - YARN, HDFS via 
Slider, ZooKeeper. If either of these components are down - the command can 
take a long time to exit.


> Add timeouts for various components in llap status check
> 
>
> Key: HIVE-14213
> URL: https://issues.apache.org/jira/browse/HIVE-14213
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14213.01.patch
>
>
> The llapstatus check connects to various components - YARN, HDFS via Slider, 
> ZooKeeper. If any of these components is down, the command can take a 
> long time to exit.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14198) Refactor aux jar related code to make them more consistent

2016-07-13 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14198:

Status: In Progress  (was: Patch Available)

> Refactor aux jar related code to make them more consistent
> --
>
> Key: HIVE-14198
> URL: https://issues.apache.org/jira/browse/HIVE-14198
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14198.1.patch
>
>
> There is some redundancy and inconsistency between hive.aux.jar.paths and 
> hive.reloadable.aux.jar.paths, and also between MR and Spark. 
> Refactor the code so that they share the same implementation.





[jira] [Commented] (HIVE-14172) LLAP: force evict blocks by size to handle memory fragmentation

2016-07-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375643#comment-15375643
 ] 

Gopal V commented on HIVE-14172:


LGTM - +1.

> LLAP: force evict blocks by size to handle memory fragmentation
> ---
>
> Key: HIVE-14172
> URL: https://issues.apache.org/jira/browse/HIVE-14172
> Project: Hive
>  Issue Type: Bug
>Reporter: Nita Dembla
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14172.01.patch, HIVE-14172.patch
>
>
> In the long run, we should replace the buddy allocator with a better scheme. 
> For now, add a workaround for fragmentation that cannot be easily resolved. 
> It's still not perfect, but it works for practical ORC cases, where we have 
> the default size and smaller blocks, rather than large allocations having 
> trouble.





[jira] [Resolved] (HIVE-14223) beeline should look for jdbc standalone jar in dist/jdbc dir instead of dist/lib

2016-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-14223.
--
   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0

Committed to master and branch-2.1.

> beeline should look for jdbc standalone jar in dist/jdbc dir instead of 
> dist/lib
> 
>
> Key: HIVE-14223
> URL: https://issues.apache.org/jira/browse/HIVE-14223
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.1, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14223.1.patch, HIVE-14223.2.patch
>
>
> HIVE-13134 changed the jdbc-standalone jar path to dist/jdbc instead of 
> dist/lib. beeline.sh still looks for the jar in dist/lib, which produces the 
> following error:
> {code}
> ls: cannot access /work/hive2/lib/hive-jdbc-*-standalone.jar: No such file or 
> directory
> {code}
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-14222) PTF: Operator initialization does not clean state

2016-07-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375624#comment-15375624
 ] 

Gopal V commented on HIVE-14222:


Yes, the plan is not deserialized again for a new split. 

The same operator tree is reused, but at the end of each split it calls 
closeOp() and then calls initializeOp().

This means we can hold onto hashtables and similar bits which were built during 
a previous split.

> PTF: Operator initialization does not clean state
> -
>
> Key: HIVE-14222
> URL: https://issues.apache.org/jira/browse/HIVE-14222
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 1.2.1, 2.2.0
>Reporter: Gopal V
>Assignee: Wei Zheng
> Attachments: HIVE-14222.1.patch
>
>
> PTFOperator::initializeOp() does not reset currentKeys to null.
> {code}
>   if (currentKeys != null && !keysAreEqual) {
>     ptfInvocation.finishPartition();
>   }
>
>   if (currentKeys == null) {
>     currentKeys = newKeys.copyKey();
>   } else {
>     currentKeys.copyKey(newKeys);
>   }
> {code}





[jira] [Commented] (HIVE-13989) Extended ACLs are not handled according to specification

2016-07-13 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375625#comment-15375625
 ] 

Chris Drome commented on HIVE-13989:


[~ashutoshc], [~spena], sorry for the delay in updating details about this 
ticket.

This is a patch that we have had to use internally since 0.13.
I don't have access to a branch-2 cluster, but I can add some notes about how 
to replicate these failures on branch-1 with the version of Hadoop we use.

> Extended ACLs are not handled according to specification
> 
>
> Key: HIVE-13989
> URL: https://issues.apache.org/jira/browse/HIVE-13989
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13989-branch-1.patch, HIVE-13989.1-branch-1.patch, 
> HIVE-13989.1.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
> programmatically and run some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     //the ACL api's also expect the tradition user/group/other permission in the form of ACL
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER,
>         sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP,
>         sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER,
>         sourcePerm.getOtherAction()));
>   }
> }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the directory sub-tree, so the above code is incomplete because it 
> effectively drops the DEFAULT rules. The second problem is with the call to 
> {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
> ACLs. When extended ACLs are used the GROUP permission is replaced with the 
> extended ACL mask. So the above code will apply the wrong permissions to the 
> GROUP. Instead the correct GROUP permissions now need to be pulled from the 
> AclEntry as returned by {{getAclStatus().getEntries()}}. See the 
> implementation of the new method {{getDefaultAclEntries}} for details.
> Similar issues exist with the HCatalog API. None of the APIs account for 
> setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
> API allow the extended ACLs to be passed into the required methods similar to 
> how basic permissions are passed in. When building the directory sub-tree the 
> extended ACLs of the table directory are inherited by all sub-directories, 
> including the DEFAULT rules.
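The point about the GROUP permission can be illustrated with a simplified model. The sketch below deliberately avoids the Hadoop classes: AclModel, Entry and owningGroupPermission are hypothetical, permissions are plain strings, and this is not the getDefaultAclEntries method from the patch. It assumes, as described above, that with an extended ACL the group bits of the file mode carry the mask, so the owning group's permission has to come from the unnamed ACCESS GROUP entry.

```java
import java.util.List;

/** Simplified, hypothetical model of ACL entries; it does not use the Hadoop classes. */
final class AclModel {
    private AclModel() {}

    enum Scope { ACCESS, DEFAULT }
    enum Type { USER, GROUP, MASK, OTHER }

    static final class Entry {
        final Scope scope;
        final Type type;
        final String name;       // null for the unnamed (owning) user or group entry
        final String permission; // for example "rwx" or "r-x"

        Entry(Scope scope, Type type, String name, String permission) {
            this.scope = scope;
            this.type = type;
            this.name = name;
            this.permission = permission;
        }
    }

    /**
     * Returns the owning group's permission. When an unnamed ACCESS GROUP entry
     * exists it wins, because the mode's group bits then hold the mask.
     */
    static String owningGroupPermission(List<Entry> aclEntries, String groupBitsOfMode) {
        for (Entry e : aclEntries) {
            if (e.scope == Scope.ACCESS && e.type == Type.GROUP && e.name == null) {
                return e.permission;
            }
        }
        // No extended ACL entry for the group: the mode bits are the permission.
        return groupBitsOfMode;
    }
}
```

In this model a directory whose mode shows rwx for the group (the mask) but whose unnamed group entry is r-x yields r-x, which is the value that should be propagated to sub-directories.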





[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification

2016-07-13 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-13989:
---
Description: 
Hive takes two approaches to working with extended ACLs depending on whether 
data is being produced via a Hive query or HCatalog APIs. A Hive query will run 
an FsShell command to recursively set the extended ACLs for a directory 
sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
programmatically and run some code to set the ACLs to match the parent 
directory.

Some incorrect assumptions were made when implementing the extended ACLs 
support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
design documents of extended ACLs in HDFS. These documents model the 
implementation after the POSIX implementation on Linux, which can be found at 
http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.

The code for setting extended ACLs via HCatalog APIs is found in HdfsUtils.java:

{code}
if (aclEnabled) {
  aclStatus = sourceStatus.getAclStatus();
  if (aclStatus != null) {
    LOG.trace(aclStatus.toString());
    aclEntries = aclStatus.getEntries();
    removeBaseAclEntries(aclEntries);

    //the ACL api's also expect the tradition user/group/other permission in the form of ACL
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER,
        sourcePerm.getUserAction()));
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP,
        sourcePerm.getGroupAction()));
    aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER,
        sourcePerm.getOtherAction()));
  }
}
{code}

We found that DEFAULT extended ACL rules were not being inherited properly by 
the directory sub-tree, so the above code is incomplete because it effectively 
drops the DEFAULT rules. The second problem is with the call to 
{{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended 
ACLs. When extended ACLs are used the GROUP permission is replaced with the 
extended ACL mask. So the above code will apply the wrong permissions to the 
GROUP. Instead the correct GROUP permissions now need to be pulled from the 
AclEntry as returned by {{getAclStatus().getEntries()}}. See the implementation 
of the new method {{getDefaultAclEntries}} for details.

Similar issues exist with the HCatalog API. None of the APIs account for 
setting extended ACLs on the directory sub-tree. The changes to the HCatalog 
API allow the extended ACLs to be passed into the required methods similar to 
how basic permissions are passed in. When building the directory sub-tree the 
extended ACLs of the table directory are inherited by all sub-directories, 
including the DEFAULT rules.

> Extended ACLs are not handled according to specification
> 
>
> Key: HIVE-13989
> URL: https://issues.apache.org/jira/browse/HIVE-13989
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13989-branch-1.patch, HIVE-13989.1-branch-1.patch, 
> HIVE-13989.1.patch
>
>
> Hive takes two approaches to working with extended ACLs depending on whether 
> data is being produced via a Hive query or HCatalog APIs. A Hive query will 
> run an FsShell command to recursively set the extended ACLs for a directory 
> sub-tree. HCatalog APIs will attempt to build up the directory sub-tree 
> programmatically and run some code to set the ACLs to match the parent 
> directory.
> Some incorrect assumptions were made when implementing the extended ACLs 
> support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the 
> design documents of extended ACLs in HDFS. These documents model the 
> implementation after the POSIX implementation on Linux, which can be found at 
> http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html.
> The code for setting extended ACLs via HCatalog APIs is found in 
> HdfsUtils.java:
> {code}
> if (aclEnabled) {
>   aclStatus = sourceStatus.getAclStatus();
>   if (aclStatus != null) {
>     LOG.trace(aclStatus.toString());
>     aclEntries = aclStatus.getEntries();
>     removeBaseAclEntries(aclEntries);
>     //the ACL api's also expect the tradition user/group/other permission in the form of ACL
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER,
>         sourcePerm.getUserAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP,
>         sourcePerm.getGroupAction()));
>     aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER,
>         sourcePerm.getOtherAction()));
>   }
> }
> {code}
> We found that DEFAULT extended ACL rules were not being inherited properly by 
> the 

[jira] [Commented] (HIVE-14215) Displaying inconsistent CPU usage data with MR execution engine

2016-07-13 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375604#comment-15375604
 ] 

Aihua Xu commented on HIVE-14215:
-

I see. It's missing the last reading of the counter after the execution 
finishes. 

+1. 

> Displaying inconsistent CPU usage data with MR execution engine
> ---
>
> Key: HIVE-14215
> URL: https://issues.apache.org/jira/browse/HIVE-14215
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14215.patch
>
>
> If the MR task is finished after printing the cumulative CPU time then there 
> is the possibility to print inconsistent CPU usage information.
> Correct one:
> {noformat}
> 2016-07-12 11:31:42,961 Stage-3 map = 0%,  reduce = 0%
> 2016-07-12 11:31:48,237 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 2.5 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 500 msec
> Ended Job = job_1468321038188_0003
> MapReduce Jobs Launched: 
> Stage-Stage-3: Map: 1   Cumulative CPU: 2.5 sec   HDFS Read: 5864 HDFS Write: 
> 103 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 500 msec
> {noformat}
> One type of inconsistent data (easily reproducible one):
> {noformat}
> 2016-07-12 11:39:00,540 Stage-3 map = 0%,  reduce = 0%
> Ended Job = job_1468321038188_0004
> MapReduce Jobs Launched: 
> Stage-Stage-3: Map: 1   Cumulative CPU: 2.51 sec   HDFS Read: 5864 HDFS 
> Write: 103 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 510 msec
> {noformat}
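
The missing final counter read mentioned in the comment above can be sketched as follows. This is only an illustration under stated assumptions, not Hive's code: FakeJob, monitor and cpuMillis are hypothetical names, and the CPU values are borrowed from the log excerpts in the description.

```java
public class FinalCounterRead {

    /** Hypothetical stand-in for a running MR job; not a Hadoop or Hive class. */
    static class FakeJob {
        private final long[] cpuByPoll; // cumulative CPU millis visible at each poll
        private int polls = 0;

        FakeJob(long... cpuByPoll) {
            this.cpuByPoll = cpuByPoll;
        }

        // The job counts as complete once the last sample has been reached.
        boolean isComplete() {
            return polls >= cpuByPoll.length - 1;
        }

        long cpuMillis() {
            return cpuByPoll[Math.min(polls, cpuByPoll.length - 1)];
        }

        void tick() {
            polls++;
        }
    }

    /**
     * Polls the job until it completes and returns the last CPU value seen.
     * When readAfterCompletion is false, the value reported is whatever the
     * last in-loop poll saw, which may lag behind the final counter.
     */
    static long monitor(FakeJob job, boolean readAfterCompletion) {
        long lastSeen = 0;
        while (!job.isComplete()) {
            lastSeen = job.cpuMillis();
            job.tick();
        }
        if (readAfterCompletion) {
            lastSeen = job.cpuMillis(); // one more read after the job has ended
        }
        return lastSeen;
    }
}
```

With samples 0, 2500 and 2510, the loop alone reports 2500 while the job summary would report 2510; reading once more after completion makes the two agree.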



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14221) set SQLStdHiveAuthorizerFactoryForTest as default HIVE_AUTHORIZATION_MANAGER

2016-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375592#comment-15375592
 ] 

Hive QA commented on HIVE-14221:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12817553/HIVE-14221.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 534 failed/errored test(s), 10299 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_skewtable
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testCheckPermissions
org.apache.hadoop.hive.llap.daemon.impl.TestLlapTokenChecker.testGetToken
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.ql.TestTxnCommands.exchangePartition
org.apache.hadoop.hive.ql.TestTxnCommands.testDelete
org.apache.hadoop.hive.ql.TestTxnCommands.testDeleteIn
org.apache.hadoop.hive.ql.TestTxnCommands.testErrors
org.apache.hadoop.hive.ql.TestTxnCommands.testExplicitRollback
org.apache.hadoop.hive.ql.TestTxnCommands.testImplicitRollback
org.apache.hadoop.hive.ql.TestTxnCommands.testInsertOverwrite
org.apache.hadoop.hive.ql.TestTxnCommands.testMultipleDelete
org.apache.hadoop.hive.ql.TestTxnCommands.testMultipleInserts
org.apache.hadoop.hive.ql.TestTxnCommands.testReadMyOwnInsert
org.apache.hadoop.hive.ql.TestTxnCommands.testSimpleAcidInsert
org.apache.hadoop.hive.ql.TestTxnCommands.testTimeOutReaper
org.apache.hadoop.hive.ql.TestTxnCommands.testUpdateDeleteOfInserts
org.apache.hadoop.hive.ql.TestTxnCommands.testUpdateOfInserts
org.apache.hadoop.hive.ql.TestTxnCommands2.testAlterTable
org.apache.hadoop.hive.ql.TestTxnCommands2.testBucketizedInputFormat
org.apache.hadoop.hive.ql.TestTxnCommands2.testDeleteIn
org.apache.hadoop.hive.ql.TestTxnCommands2.testFailHeartbeater
org.apache.hadoop.hive.ql.TestTxnCommands2.testFileSystemUnCaching
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hadoop.hive.ql.TestTxnCommands2.testInsertOverwriteWithSelfJoin
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion1
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion2
org.apache.hadoop.hive.ql.TestTxnCommands2.testNonAcidToAcidConversion3
org.apache.hadoop.hive.ql.TestTxnCommands2.testOpenTxnsCounter
org.apache.hadoop.hive.ql.TestTxnCommands2.testOrcNoPPD
org.apache.hadoop.hive.ql.TestTxnCommands2.testOrcPPD
org.apache.hadoop.hive.ql.TestTxnCommands2.testUpdateMixedCase
org.apache.hadoop.hive.ql.TestTxnCommands2.testValidTxnsBookkeeping
org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned
org.apache.hadoop.hive.ql.TestTxnCommands2.writeBetweenWorkerAndCleaner
org.apache.hadoop.hive.ql.exec.TestExecDriver.initializationError
org.apache.hadoop.hive.ql.exec.TestOperators.testFetchOperatorContext
org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperator
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testBuildDag
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testEmptyWork
org.apache.hadoop.hive.ql.hooks.TestHooks.org.apache.hadoop.hive.ql.hooks.TestHooks
org.apache.hadoop.hive.ql.io.TestSymlinkTextInputFormat.testCombine
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLExclusive
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLNoLock
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDDLShared
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testDelete
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testExceptions
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testHeartbeater
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testJoin
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockAcquisitionAndRelease
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testLockTimeout
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testReadWrite
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testRollback
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadMultiPartition
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadPartition
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager.testSingleReadTable

[jira] [Comment Edited] (HIVE-14214) ORC Schema Evolution and Predicate Push Down do not work together (no rows returned)

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375539#comment-15375539
 ] 

Sergey Shelukhin edited comment on HIVE-14214 at 7/13/16 7:01 PM:
--

HBase metadata PPD feature is not complete right now (or rather it should work 
and is somewhat usable but it's alpha with HBase metastore and there's 
convenience and testing to be added). Unless I am missing something it should 
be ok to break it as long as it breaks in a reasonable manner (e.g. throws 
suggesting the user not enable it, it's off by default), and there's a followup 
JIRA


was (Author: sershe):
HBase PPD feature is not complete right now. Unless I am missing something it 
should be ok to break it as long as it breaks in a reasonable manner (e.g. 
throws suggesting the user not enable it, it's off by default), and there's a 
followup JIRA

> ORC Schema Evolution and Predicate Push Down do not work together (no rows 
> returned)
> 
>
> Key: HIVE-14214
> URL: https://issues.apache.org/jira/browse/HIVE-14214
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-14214.WIP.patch
>
>
> In Schema Evolution, the reader schema is different than the file schema 
> which is used to evaluate predicate push down.





[jira] [Commented] (HIVE-14214) ORC Schema Evolution and Predicate Push Down do not work together (no rows returned)

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375539#comment-15375539
 ] 

Sergey Shelukhin commented on HIVE-14214:
-

HBase PPD feature is not complete right now. Unless I am missing something it 
should be ok to break it as long as it breaks in a reasonable manner (e.g. 
throws suggesting the user not enable it, it's off by default), and there's a 
followup JIRA

> ORC Schema Evolution and Predicate Push Down do not work together (no rows 
> returned)
> 
>
> Key: HIVE-14214
> URL: https://issues.apache.org/jira/browse/HIVE-14214
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-14214.WIP.patch
>
>
> In Schema Evolution, the reader schema is different than the file schema 
> which is used to evaluate predicate push down.





[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375529#comment-15375529
 ] 

Sergey Shelukhin commented on HIVE-14077:
-

Well, HIVE-13945 has the motivation to change it... :) And there are other 
similar issues (that's why we reverted HIVE-13380 from 2.1 before 2.1 release - 
it broke some TPCDS/H queries due to double arithmetic).
Also I thought Hive does strive to be ANSI compliant? :)
Wrt special form, that is preserved, I also added 0.06D for double.

> revert or fix HIVE-13380
> 
>
> Key: HIVE-14077
> URL: https://issues.apache.org/jira/browse/HIVE-14077
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> See comments in that JIRA





[jira] [Commented] (HIVE-14213) Add timeouts for various components in llap status check

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375523#comment-15375523
 ] 

Sergey Shelukhin commented on HIVE-14213:
-

Can you still only set configs if not already set? At least when replacing with 
defaults.
Otherwise +1 ... also can you file a JIRA to remove those?
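
The "only set if not already set" behaviour requested above could look roughly like this. It is a minimal sketch using a plain Map in place of the real configuration object; the property keys and timeout values are hypothetical and do not come from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class TimeoutDefaults {

    /** Stores the value only when the key has no value yet. */
    static void setIfUnset(Map<String, String> conf, String key, String value) {
        conf.putIfAbsent(key, value);
    }

    /** Returns a copy of the user's settings with defaults filled in around them. */
    static Map<String, String> withDefaults(Map<String, String> userConf) {
        Map<String, String> conf = new HashMap<>(userConf);
        // Hypothetical keys and values, for illustration only.
        setIfUnset(conf, "example.zk.timeout.ms", "10000");
        setIfUnset(conf, "example.yarn.timeout.ms", "15000");
        return conf;
    }
}
```

A value supplied by the user survives, and the default only applies to keys the user left unset.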

> Add timeouts for various components in llap status check
> 
>
> Key: HIVE-14213
> URL: https://issues.apache.org/jira/browse/HIVE-14213
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14213.01.patch
>
>
> The llapstatus check connects to various components - YARN, HDFS via Slider, 
> ZooKeeper. If any of these components is down, the command can take a 
> long time to exit.





[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380

2016-07-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375524#comment-15375524
 ] 

Xuefu Zhang commented on HIVE-14077:


Thanks for the clarification. Regarding ANSI, we don't claim that Hive is ANSI 
compliant. Hive is already treating 0.06 as double always, and I'd say b/c is 
more important than a motivation to make it ANSI-compliant. Again, Hive uses 
0.06BD as decimal literal.

> revert or fix HIVE-13380
> 
>
> Key: HIVE-14077
> URL: https://issues.apache.org/jira/browse/HIVE-14077
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> See comments in that JIRA





[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380

2016-07-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375515#comment-15375515
 ] 

Sergey Shelukhin commented on HIVE-14077:
-

Well, per ANSI, literals like 0.06 should be treated as decimal. Treating them 
as double in filters also causes surprising query results.
In fact, HIVE-13945 changed the treatment on master. I think the idea of this 
JIRA was now to add a test. We also need to check that all the active branches 
either have both or neither of these patches, for consistency.
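
To illustrate the kind of surprise meant here, a small sketch in plain Java rather than HiveQL follows. It assumes Java's double and BigDecimal are reasonable stand-ins for Hive's double and decimal types; the class and method names are made up for the example.

```java
import java.math.BigDecimal;

public class LiteralTypes {

    /** Binary floating point: neither 0.06 nor 0.01 is exactly representable. */
    static boolean doubleSumMatches() {
        return 0.06 + 0.01 == 0.07;
    }

    /** Exact decimal arithmetic: the sum is exactly 0.07. */
    static boolean decimalSumMatches() {
        BigDecimal sum = new BigDecimal("0.06").add(new BigDecimal("0.01"));
        return sum.compareTo(new BigDecimal("0.07")) == 0;
    }
}
```

The double comparison is false while the decimal comparison is true, so an equality filter can drop rows that a user would expect to match.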

> revert or fix HIVE-13380
> 
>
> Key: HIVE-14077
> URL: https://issues.apache.org/jira/browse/HIVE-14077
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> See comments in that JIRA





[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380

2016-07-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375512#comment-15375512
 ] 

Xuefu Zhang commented on HIVE-14077:


Sorry for chiming in late, but I'm wondering what the focal point of the debate is here.

As [~jdere] pointed out, Hive has a specific decimal syntax, and a numeric such 
as 0.06 is treated as double. I don't see anything wrong with that. Making the 0.06 
literal a decimal is another concern for b/c.

Regarding operation ordering, I think we all agree that a nonexact type operating 
with an exact type produces a nonexact type. This should apply to comparison as 
well. However, b/c is a valid concern and should be called out, as pointed out 
in HIVE-13380.

> revert or fix HIVE-13380
> 
>
> Key: HIVE-14077
> URL: https://issues.apache.org/jira/browse/HIVE-14077
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> See comments in that JIRA





[jira] [Commented] (HIVE-14223) beeline should look for jdbc standalone jar in dist/jdbc dir instead of dist/lib

2016-07-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375495#comment-15375495
 ] 

Siddharth Seth commented on HIVE-14223:
---

+1

> beeline should look for jdbc standalone jar in dist/jdbc dir instead of 
> dist/lib
> 
>
> Key: HIVE-14223
> URL: https://issues.apache.org/jira/browse/HIVE-14223
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.1, 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14223.1.patch, HIVE-14223.2.patch
>
>
> HIVE-13134 changed the jdbc-standalone jar path to dist/jdbc instead of 
> dist/lib. beeline.sh still looks for the jar in dist/lib which throws the 
> following error
> {code}
> ls: cannot access /work/hive2/lib/hive-jdbc-*-standalone.jar: No such file or 
> directory
> {code}
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-14219) LLAP external client on secure cluster: Protocol interface org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol is not known

2016-07-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14219:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master

> LLAP external client on secure cluster: Protocol interface 
> org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol is not known
> --
>
> Key: HIVE-14219
> URL: https://issues.apache.org/jira/browse/HIVE-14219
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 2.2.0
>
> Attachments: HIVE-14219.1.patch
>
>
> {noformat}
> 2016-07-07T23:10:35,249 INFO  [TaskHeartbeatThread[]]: task.TezTaskRunner2 
> (:()) - TaskReporter reporter error which will cause the task to fail
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
>  Protocol interface 
> org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol is not known.
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1551)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1495)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1395)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:241)
>   at com.sun.proxy.$Proxy39.heartbeat(Unknown Source)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:280)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:202)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:139)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Updated] (HIVE-14222) PTF: Operator initialization does not clean state

2016-07-13 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14222:
-
Attachment: HIVE-14222.1.patch

> PTF: Operator initialization does not clean state
> -
>
> Key: HIVE-14222
> URL: https://issues.apache.org/jira/browse/HIVE-14222
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 1.2.1, 2.2.0
>Reporter: Gopal V
>Assignee: Wei Zheng
> Attachments: HIVE-14222.1.patch
>
>
> PTFOperator::initializeOp() does not reset currentKeys to null.
> {code}
>   if (currentKeys != null && !keysAreEqual) {
>     ptfInvocation.finishPartition();
>   }
>
>   if (currentKeys == null) {
>     currentKeys = newKeys.copyKey();
>   } else {
>     currentKeys.copyKey(newKeys);
>   }
> {code}





[jira] [Commented] (HIVE-14222) PTF: Operator initialization does not clean state

2016-07-13 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375474#comment-15375474
 ] 

Wei Zheng commented on HIVE-14222:
--

[~ashutoshc] Tez keeps an object registry / object cache. That explains why 
we're not necessarily creating new objects every time.
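
A rough sketch of how a cached object can carry state across tasks is given below. It is not the PTFOperator or Tez code: Operator, runTwoTasks and the map used as an object cache are hypothetical, and the way the cache hands back the same instance is an assumption drawn from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class CachedOperatorState {

    /** Hypothetical operator that tracks the partition key it saw last. */
    static class Operator {
        private final boolean resetOnInit;
        String currentKeys;
        int finishedPartitions;

        Operator(boolean resetOnInit) {
            this.resetOnInit = resetOnInit;
        }

        void initializeOp() {
            finishedPartitions = 0;
            if (resetOnInit) {
                currentKeys = null; // the reset the issue reports as missing
            }
        }

        void process(String newKeys) {
            boolean keysAreEqual = newKeys.equals(currentKeys);
            if (currentKeys != null && !keysAreEqual) {
                finishedPartitions++; // closes a partition, possibly a stale one
            }
            currentKeys = newKeys;
        }
    }

    /** Runs two tasks against one cached operator; returns partitions closed by task 2. */
    static int runTwoTasks(boolean resetOnInit) {
        Map<String, Operator> cache = new HashMap<>();

        Operator first = cache.computeIfAbsent("ptf", k -> new Operator(resetOnInit));
        first.initializeOp();
        first.process("a");

        // The second task gets the same instance back from the cache.
        Operator second = cache.computeIfAbsent("ptf", k -> new Operator(resetOnInit));
        second.initializeOp();
        second.process("b");
        return second.finishedPartitions;
    }
}
```

Without the reset, the second task sees the key left behind by the first and closes a partition it never opened; with the reset it closes none.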

> PTF: Operator initialization does not clean state
> -
>
> Key: HIVE-14222
> URL: https://issues.apache.org/jira/browse/HIVE-14222
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 1.2.1, 2.2.0
>Reporter: Gopal V
>Assignee: Wei Zheng
>
> PTFOperator::initializeOp() does not reset currentKeys to null.
> {code}
>   if (currentKeys != null && !keysAreEqual) {
>     ptfInvocation.finishPartition();
>   }
>
>   if (currentKeys == null) {
>     currentKeys = newKeys.copyKey();
>   } else {
>     currentKeys.copyKey(newKeys);
>   }
> {code}





[jira] [Commented] (HIVE-14222) PTF: Operator initialization does not clean state

2016-07-13 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375456#comment-15375456
 ] 

Wei Zheng commented on HIVE-14222:
--

[~gopalv] Can you briefly explain how container reuse works 
here? I talked to [~ashutoshc], and he is not convinced this can happen, 
given how the operator pipeline works: we create a new object every time, 
even when reusing the same JVM.

> PTF: Operator initialization does not clean state
> -
>
> Key: HIVE-14222
> URL: https://issues.apache.org/jira/browse/HIVE-14222
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 1.2.1, 2.2.0
>Reporter: Gopal V
>Assignee: Wei Zheng
>
> PTFOperator::initializeOp() does not reset currentKeys to null.
> {code}
>   if (currentKeys != null && !keysAreEqual) {
>     ptfInvocation.finishPartition();
>   }
>
>   if (currentKeys == null) {
>     currentKeys = newKeys.copyKey();
>   } else {
>     currentKeys.copyKey(newKeys);
>   }
> {code}





[jira] [Commented] (HIVE-14221) set SQLStdHiveAuthorizerFactoryForTest as default HIVE_AUTHORIZATION_MANAGER

2016-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375444#comment-15375444
 ] 

Pengcheng Xiong commented on HIVE-14221:


[~ashutoshc], The above code path will be hit if materialize cte is off. If 
materialize cte is on, the query is rewritten as scan from a temporary table. 
However, that "temporary" table does not exist in the 
sessionhivemetastoreclient. 

> set SQLStdHiveAuthorizerFactoryForTest as default HIVE_AUTHORIZATION_MANAGER
> 
>
> Key: HIVE-14221
> URL: https://issues.apache.org/jira/browse/HIVE-14221
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-14221.01.patch, HIVE-14221.02.patch
>
>






[jira] [Updated] (HIVE-14180) Disable LlapZookeeperRegistry ZK auth setup for external clients

2016-07-13 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14180:
--
Attachment: HIVE-14180.2.patch

rebasing patch

> Disable LlapZookeeperRegistry ZK auth setup for external clients
> 
>
> Key: HIVE-14180
> URL: https://issues.apache.org/jira/browse/HIVE-14180
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-14180.02.patch, HIVE-14180.1.patch, 
> HIVE-14180.2.patch
>
>
> {noformat}
> Caused by: org.apache.hadoop.service.ServiceStateException: 
> java.io.IOException: Llap Kerberos keytab is empty
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
> at 
> org.apache.hadoop.hive.llap.registry.impl.LlapRegistryService.getClient(LlapRegistryService.java:67)
> at 
> org.apache.hadoop.hive.llap.LlapBaseInputFormat.getServiceInstance(LlapBaseInputFormat.java:238)
> at 
> org.apache.hadoop.hive.llap.LlapBaseInputFormat.getRecordReader(LlapBaseInputFormat.java:142)
> at 
> org.apache.hadoop.hive.llap.LlapRowInputFormat.getRecordReader(LlapRowInputFormat.java:51)
> {noformat}
> When the LLAP ZK registry is used in environments other than the LLAP daemon (such 
> as external LLAP clients), there should be a way to skip this ZK auth setup.




