[jira] [Commented] (HIVE-13901) Hivemetastore add partitions can be slow depending on filesystems

2016-06-16 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335474#comment-15335474
 ] 

Rajesh Balamohan commented on HIVE-13901:
-

The wrong patch got uploaded to RB, causing the issue. The earlier patch ended 
up accidentally removing the TP changes in add_partitions_pspec_core. Uploaded 
the correct patch to RB.

> Hivemetastore add partitions can be slow depending on filesystems
> -
>
> Key: HIVE-13901
> URL: https://issues.apache.org/jira/browse/HIVE-13901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-13901.1.patch, HIVE-13901.2.patch
>
>
> Depending on the FS, creating external tables & adding partitions can be 
> expensive (e.g. msck, which adds all partitions).
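
For reference, a minimal sketch (hypothetical table name) of the kind of 
statement that bulk-adds partitions:
{code}
-- MSCK scans the table's location on the filesystem and adds every partition
-- it finds, so per-partition FS listing/stat costs are multiplied accordingly
MSCK REPAIR TABLE web_logs;
{code}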



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13901) Hivemetastore add partitions can be slow depending on filesystems

2016-06-16 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335474#comment-15335474
 ] 

Rajesh Balamohan edited comment on HIVE-13901 at 6/17/16 5:58 AM:
--

The wrong patch got uploaded to RB, causing the issue earlier. The previous 
patch ended up accidentally removing the TP changes in 
add_partitions_pspec_core. Uploaded the correct patch to RB.


was (Author: rajesh.balamohan):
Wrong patch got uploaded in RB causing the issue. Earlier patch ended up 
removing TP changes in add_partitions_pspec_core accidently. Uploaded the 
correct patch in RB.

> Hivemetastore add partitions can be slow depending on filesystems
> -
>
> Key: HIVE-13901
> URL: https://issues.apache.org/jira/browse/HIVE-13901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-13901.1.patch, HIVE-13901.2.patch
>
>
> Depending on the FS, creating external tables & adding partitions can be 
> expensive (e.g. msck, which adds all partitions).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14029) Update Spark version to 2.0.0

2016-06-16 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-14029:
---

Assignee: Ferdinand Xu

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump 
> Spark up to 2.0.0 to benefit from those performance improvements.
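
As a sketch of the kind of change involved, assuming the Spark dependency is 
driven by a version property in the root pom (hypothetical snippet):
{noformat}
<spark.version>2.0.0</spark.version>
{noformat}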



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14023:
--
Attachment: HIVE-14023.02.patch

Updated patch to address rb feedback.

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch, HIVE-14023.02.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is available right now after the Processor starts, which is too 
> late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14013) Describe table doesn't show unicode properly

2016-06-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335345#comment-15335345
 ] 

Hive QA commented on HIVE-14013:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12811140/HIVE-14013.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10234 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_repair
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unicode_notation
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_table_nonprintable
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testTableCheck
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/145/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/145/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-145/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12811140 - PreCommit-HIVE-MASTER-Build

> Describe table doesn't show unicode properly
> 
>
> Key: HIVE-14013
> URL: https://issues.apache.org/jira/browse/HIVE-14013
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14013.1.patch, HIVE-14013.2.patch
>
>
> Describe table output shows comments as escaped unicode sequences rather than 
> the unicode itself.
> {noformat}
> hive> desc formatted t1;
> # Detailed Table Information 
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
> comment \u8868\u4E2D\u6587\u6D4B\u8BD5
> numFiles0   
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14038) miscellaneous acid improvements

2016-06-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14038:
--
Status: Patch Available  (was: Open)

> miscellaneous acid improvements
> ---
>
> Key: HIVE-14038
> URL: https://issues.apache.org/jira/browse/HIVE-14038
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14038.patch
>
>
> 1. fix thread names in HouseKeeperServiceBase (currently they are all 
> "org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0")
> 2. dump metastore configs from HiveConf on startup to help record the values 
> of properties
> 3. add some tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14038) miscellaneous acid improvements

2016-06-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14038:
--
Attachment: HIVE-14038.patch

> miscellaneous acid improvements
> ---
>
> Key: HIVE-14038
> URL: https://issues.apache.org/jira/browse/HIVE-14038
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14038.patch
>
>
> 1. fix thread names in HouseKeeperServiceBase (currently they are all 
> "org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0")
> 2. dump metastore configs from HiveConf on startup to help record the values 
> of properties
> 3. add some tests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13999) Cannot modify mapreduce.job.name at runtime when hive security authorization is enabled

2016-06-16 Thread gaojun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335229#comment-15335229
 ] 

gaojun edited comment on HIVE-13999 at 6/17/16 2:21 AM:


@Sergio Peña
Many thanks! I solved the issue by setting 
hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization


was (Author: bendan.0...@163.com):
@Sergio Peña
Many thanks! I solved the issue by setting 
'''hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*'''
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization

> Cannot modify mapreduce.job.name at runtime when hive security authorization 
> is enabled 
> 
>
> Key: HIVE-13999
> URL: https://issues.apache.org/jira/browse/HIVE-13999
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Beeline
>Affects Versions: 0.14.0
> Environment: centos 6.5
> ranger 0.4.0
> hive 0.14.0
>Reporter: gaojun
>Assignee: gaojun
>  Labels: security
>
> Cannot set mapreduce.job.name at runtime when hive security authorization is 
> enabled!
> I use Ranger and have enabled HiveServer2 and Hive CLI security authorization.
> With the Hive CLI I can set the mapreduce.job.name property via 'set 
> mapreduce.job.name=job1'.
> With beeline connected to the secured HiveServer2, running 'set 
> mapreduce.job.name=job1' fails with an exception like this:
> Error: Error while processing statement: Cannot modify mapreduce.job.name at 
> runtime. It is not in list of params that are allowed to be modified at 
> runtime (state=42000,code=1).
> So what's wrong with it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13999) Cannot modify mapreduce.job.name at runtime when hive security authorization is enabled

2016-06-16 Thread gaojun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335229#comment-15335229
 ] 

gaojun edited comment on HIVE-13999 at 6/17/16 2:20 AM:


@Sergio Peña
Many thanks! I solved the issue by setting 
'hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*'
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization


was (Author: bendan.0...@163.com):
@Sergio Peña
Many thanks! I solved the issue by setting 
hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization

> Cannot modify mapreduce.job.name at runtime when hive security authorization 
> is enabled 
> 
>
> Key: HIVE-13999
> URL: https://issues.apache.org/jira/browse/HIVE-13999
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Beeline
>Affects Versions: 0.14.0
> Environment: centos 6.5
> ranger 0.4.0
> hive 0.14.0
>Reporter: gaojun
>Assignee: gaojun
>  Labels: security
>
> Cannot set mapreduce.job.name at runtime when hive security authorization is 
> enabled!
> I use Ranger and have enabled HiveServer2 and Hive CLI security authorization.
> With the Hive CLI I can set the mapreduce.job.name property via 'set 
> mapreduce.job.name=job1'.
> With beeline connected to the secured HiveServer2, running 'set 
> mapreduce.job.name=job1' fails with an exception like this:
> Error: Error while processing statement: Cannot modify mapreduce.job.name at 
> runtime. It is not in list of params that are allowed to be modified at 
> runtime (state=42000,code=1).
> So what's wrong with it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13999) Cannot modify mapreduce.job.name at runtime when hive security authorization is enabled

2016-06-16 Thread gaojun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335229#comment-15335229
 ] 

gaojun edited comment on HIVE-13999 at 6/17/16 2:20 AM:


@Sergio Peña
Many thanks! I solved the issue by setting 
'''hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*'''
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization


was (Author: bendan.0...@163.com):
@Sergio Peña
Many thanks! I solved the issue by setting 
'hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*'
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization

> Cannot modify mapreduce.job.name at runtime when hive security authorization 
> is enabled 
> 
>
> Key: HIVE-13999
> URL: https://issues.apache.org/jira/browse/HIVE-13999
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Beeline
>Affects Versions: 0.14.0
> Environment: centos 6.5
> ranger 0.4.0
> hive 0.14.0
>Reporter: gaojun
>Assignee: gaojun
>  Labels: security
>
> Cannot set mapreduce.job.name at runtime when hive security authorization is 
> enabled!
> I use Ranger and have enabled HiveServer2 and Hive CLI security authorization.
> With the Hive CLI I can set the mapreduce.job.name property via 'set 
> mapreduce.job.name=job1'.
> With beeline connected to the secured HiveServer2, running 'set 
> mapreduce.job.name=job1' fails with an exception like this:
> Error: Error while processing statement: Cannot modify mapreduce.job.name at 
> runtime. It is not in list of params that are allowed to be modified at 
> runtime (state=42000,code=1).
> So what's wrong with it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13999) Cannot modify mapreduce.job.name at runtime when hive security authorization is enabled

2016-06-16 Thread gaojun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojun resolved HIVE-13999.
---
Resolution: Fixed
  Assignee: gaojun

> Cannot modify mapreduce.job.name at runtime when hive security authorization 
> is enabled 
> 
>
> Key: HIVE-13999
> URL: https://issues.apache.org/jira/browse/HIVE-13999
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Beeline
>Affects Versions: 0.14.0
> Environment: centos 6.5
> ranger 0.4.0
> hive 0.14.0
>Reporter: gaojun
>Assignee: gaojun
>  Labels: security
>
> Cannot set mapreduce.job.name at runtime when hive security authorization is 
> enabled!
> I use Ranger and have enabled HiveServer2 and Hive CLI security authorization.
> With the Hive CLI I can set the mapreduce.job.name property via 'set 
> mapreduce.job.name=job1'.
> With beeline connected to the secured HiveServer2, running 'set 
> mapreduce.job.name=job1' fails with an exception like this:
> Error: Error while processing statement: Cannot modify mapreduce.job.name at 
> runtime. It is not in list of params that are allowed to be modified at 
> runtime (state=42000,code=1).
> So what's wrong with it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13999) Cannot modify mapreduce.job.name at runtime when hive security authorization is enabled

2016-06-16 Thread gaojun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335229#comment-15335229
 ] 

gaojun commented on HIVE-13999:
---

@Sergio Peña
Many thanks! I solved the issue by setting 
hive.security.authorization.sqlstd.confwhitelist.append=|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*
From the document I know HS2 limits which parameters can be set when security 
is enabled. By default mapreduce.job.name is not in the 
hive.security.authorization.sqlstd.confwhitelist parameter.

https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
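
For reference, a sketch of where such a setting typically lives; the append 
property itself generally has to be picked up by HiveServer2 at startup (e.g. 
via hive-site.xml), with the value mirroring the one above:
{noformat}
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>|mapreduce.job.*|mapreduce.map.*|mapreduce.reduce.*</value>
</property>
{noformat}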

> Cannot modify mapreduce.job.name at runtime when hive security authorization 
> is enabled 
> 
>
> Key: HIVE-13999
> URL: https://issues.apache.org/jira/browse/HIVE-13999
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, Beeline
>Affects Versions: 0.14.0
> Environment: centos 6.5
> ranger 0.4.0
> hive 0.14.0
>Reporter: gaojun
>  Labels: security
>
> Cannot set mapreduce.job.name at runtime when hive security authorization is 
> enabled!
> I use Ranger and have enabled HiveServer2 and Hive CLI security authorization.
> With the Hive CLI I can set the mapreduce.job.name property via 'set 
> mapreduce.job.name=job1'.
> With beeline connected to the secured HiveServer2, running 'set 
> mapreduce.job.name=job1' fails with an exception like this:
> Error: Error while processing statement: Cannot modify mapreduce.job.name at 
> runtime. It is not in list of params that are allowed to be modified at 
> runtime (state=42000,code=1).
> So what's wrong with it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13648) ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, or DECIMAL when maxLength or precision/scale is different

2016-06-16 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335183#comment-15335183
 ] 

Matt McCline commented on HIVE-13648:
-

Committed to master.
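
For context, a minimal sketch (hypothetical table and column names) of the 
scenario this fix covers, i.e. reading ORC data written as varchar(145) through 
a DDL that now declares varchar(114):
{code}
-- Hypothetical repro: write ORC data with one maxLength, then read it under another
CREATE TABLE evol_test (name VARCHAR(145)) STORED AS ORC;
INSERT INTO evol_test VALUES ('abc');
ALTER TABLE evol_test CHANGE COLUMN name name VARCHAR(114);
-- Before the fix, reading the old file through the new schema raised the
-- "ORC does not support type conversion" error quoted below
SELECT * FROM evol_test;
{code}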

> ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, 
> or DECIMAL when maxLength or precision/scale is different
> --
>
> Key: HIVE-13648
> URL: https://issues.apache.org/jira/browse/HIVE-13648
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-13648.01.patch, HIVE-13648.02.patch, 
> HIVE-13648.03.patch, HIVE-13648.04.patch
>
>
> E.g. when a data file that is copied in has a VARCHAR maxLength that doesn't 
> match the DDL's maxLength. This error is produced:
> {code}
> java.io.IOException: ORC does not support type conversion from file type 
> varchar(145) (36) to reader type varchar(114) (36)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13258) LLAP: Add hdfs bytes read and spilled bytes to tez print summary

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13258:
-
Attachment: HIVE-13258.5.patch

Removed the map from thread pool per [~sseth]'s comment.

> LLAP: Add hdfs bytes read and spilled bytes to tez print summary
> 
>
> Key: HIVE-13258
> URL: https://issues.apache.org/jira/browse/HIVE-13258
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13258.1.patch, HIVE-13258.1.patch, 
> HIVE-13258.2.patch, HIVE-13258.3.patch, HIVE-13258.4.patch, 
> HIVE-13258.5.patch, llap-fs-counters-full-cache-hit.png, llap-fs-counters.png
>
>
> When printing counters to the console, it will be useful to print HDFS bytes 
> read and spilled bytes, which will help with debugging issues faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13648) ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, or DECIMAL when maxLength or precision/scale is different

2016-06-16 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13648:

Fix Version/s: 2.2.0

> ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, 
> or DECIMAL when maxLength or precision/scale is different
> --
>
> Key: HIVE-13648
> URL: https://issues.apache.org/jira/browse/HIVE-13648
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-13648.01.patch, HIVE-13648.02.patch, 
> HIVE-13648.03.patch, HIVE-13648.04.patch
>
>
> E.g. when a data file that is copied in has a VARCHAR maxLength that doesn't 
> match the DDL's maxLength. This error is produced:
> {code}
> java.io.IOException: ORC does not support type conversion from file type 
> varchar(145) (36) to reader type varchar(114) (36)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13648) ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, or DECIMAL when maxLength or precision/scale is different

2016-06-16 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335166#comment-15335166
 ] 

Matt McCline commented on HIVE-13648:
-

Failures are unrelated.

> ORC Schema Evolution doesn't support same type conversion for VARCHAR, CHAR, 
> or DECIMAL when maxLength or precision/scale is different
> --
>
> Key: HIVE-13648
> URL: https://issues.apache.org/jira/browse/HIVE-13648
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-13648.01.patch, HIVE-13648.02.patch, 
> HIVE-13648.03.patch, HIVE-13648.04.patch
>
>
> E.g. when a data file that is copied in has a VARCHAR maxLength that doesn't 
> match the DDL's maxLength. This error is produced:
> {code}
> java.io.IOException: ORC does not support type conversion from file type 
> varchar(145) (36) to reader type varchar(114) (36)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14012) some ColumnVector-s are missing ensureSize

2016-06-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14012:

Attachment: HIVE-14012.01.patch

Same patch for HiveQA. I checked; it looks like we do not need to call 
ensureSize on child vectors, as it's done elsewhere.

> some ColumnVector-s are missing ensureSize
> --
>
> Key: HIVE-14012
> URL: https://issues.apache.org/jira/browse/HIVE-14012
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14012.01.patch, HIVE-14012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14041) llap scripts add hadoop and other libraries from the machine local install to the daemon classpath

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14041:
--
Attachment: HIVE-14041.01.patch

The patch gets rid of the hadoop classpath invocation.
It also removes the requirement for HADOOP_CONF_DIR to be specified.

[~gopalv] - could you please take a look?

> llap scripts add hadoop and other libraries from the machine local install to 
> the daemon classpath
> --
>
> Key: HIVE-14041
> URL: https://issues.apache.org/jira/browse/HIVE-14041
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14041.01.patch
>
>
> `hadoop classpath` ends up getting added to the classpath of llap daemons. 
> This essentially means picking up the classpath from the local deploy.
> This isn't required since the slider package includes relevant libraries 
> (shipped from the client)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14041) llap scripts add hadoop and other libraries from the machine local install to the daemon classpath

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14041:
--
Status: Patch Available  (was: Open)

> llap scripts add hadoop and other libraries from the machine local install to 
> the daemon classpath
> --
>
> Key: HIVE-14041
> URL: https://issues.apache.org/jira/browse/HIVE-14041
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14041.01.patch
>
>
> `hadoop classpath` ends up getting added to the classpath of llap daemons. 
> This essentially means picking up the classpath from the local deploy.
> This isn't required since the slider package includes relevant libraries 
> (shipped from the client)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14003) queries running against llap hang at times - preemption issues

2016-06-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335111#comment-15335111
 ] 

Prasanth Jayachandran commented on HIVE-14003:
--

Can you file the TODOs as follow-up jiras? nit: "KKK" can be removed, as can 
the "Reviewer" comments.
Other than that LGTM +1

> queries running against llap hang at times - preemption issues
> --
>
> Key: HIVE-14003
> URL: https://issues.apache.org/jira/browse/HIVE-14003
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Takahiko Saito
>Assignee: Siddharth Seth
> Attachments: HIVE-14003.01.patch
>
>
> The preemption logic in the Hive processor needs some more work. There are 
> definitely windows where the abort flag is completely dropped within the Hive 
> processor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-06-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335109#comment-15335109
 ] 

Eugene Koifman commented on HIVE-13974:
---

[~mmccline] fyi, patch 4 doesn't include the version of in 
testCompactWithDelete()
https://issues.apache.org/jira/browse/HIVE-13974?focusedCommentId=15332304&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15332304


> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema, which doesn't work for adding columns to non-last STRUCT data 
> type columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13913) LLAP: introduce backpressure to recordreader

2016-06-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13913:

Attachment: HIVE-13913.03.patch

Fixing thread safety issues, and a stupid error, in the last patch.

> LLAP: introduce backpressure to recordreader
> 
>
> Key: HIVE-13913
> URL: https://issues.apache.org/jira/browse/HIVE-13913
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13913.01.patch, HIVE-13913.02.patch, 
> HIVE-13913.03.patch, HIVE-13913.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13970) refactor LLAPIF splits - get rid of SubmitWorkInfo

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335072#comment-15335072
 ] 

Sergey Shelukhin commented on HIVE-13970:
-

Well, I mostly want to get rid of writables and/or nested writables.

> refactor LLAPIF splits - get rid of SubmitWorkInfo
> --
>
> Key: HIVE-13970
> URL: https://issues.apache.org/jira/browse/HIVE-13970
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13970.01.patch, HIVE-13970.only.patch, 
> HIVE-13970.patch, HIVE-13970.patch
>
>
> First we build the signable vertex spec, convert it into bytes (as we 
> should), and put it inside SubmitWorkInfo. Then we serialize that into byte[] 
> and put it into LlapInputSplit. Then we serialize that to return... We should 
> get rid of one of the steps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14003) queries running against llap hang at times - preemption issues

2016-06-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335050#comment-15335050
 ] 

Siddharth Seth commented on HIVE-14003:
---

Which ones specifically?
My intent was to fix the rest of the TODOs left in the code as follow-ups 
(after getting more clarity, and some more support from Tez): fix one known 
problem for now, and create new jiras to fix the others.

> queries running against llap hang at times - preemption issues
> --
>
> Key: HIVE-14003
> URL: https://issues.apache.org/jira/browse/HIVE-14003
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Takahiko Saito
>Assignee: Siddharth Seth
> Attachments: HIVE-14003.01.patch
>
>
> The preemption logic in the Hive processor needs some more work. There are 
> definitely windows where the abort flag is completely dropped within the Hive 
> processor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13258) LLAP: Add hdfs bytes read and spilled bytes to tez print summary

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13258:
-
Status: Patch Available  (was: Open)

> LLAP: Add hdfs bytes read and spilled bytes to tez print summary
> 
>
> Key: HIVE-13258
> URL: https://issues.apache.org/jira/browse/HIVE-13258
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13258.1.patch, HIVE-13258.1.patch, 
> HIVE-13258.2.patch, HIVE-13258.3.patch, HIVE-13258.4.patch, 
> llap-fs-counters-full-cache-hit.png, llap-fs-counters.png
>
>
> When printing counters to the console, it will be useful to print HDFS bytes 
> read and spilled bytes, which will help with debugging issues faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13258) LLAP: Add hdfs bytes read and spilled bytes to tez print summary

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13258:
-
Attachment: HIVE-13258.4.patch

Addressed [~sseth]'s review comments. Also using 0.8.4-SNAPSHOT tez version for 
precommit test run.

> LLAP: Add hdfs bytes read and spilled bytes to tez print summary
> 
>
> Key: HIVE-13258
> URL: https://issues.apache.org/jira/browse/HIVE-13258
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13258.1.patch, HIVE-13258.1.patch, 
> HIVE-13258.2.patch, HIVE-13258.3.patch, HIVE-13258.4.patch, 
> llap-fs-counters-full-cache-hit.png, llap-fs-counters.png
>
>
> When printing counters to the console, it will be useful to print HDFS bytes 
> read and spilled bytes, which will help with debugging issues faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13970) refactor LLAPIF splits - get rid of SubmitWorkInfo

2016-06-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335035#comment-15335035
 ] 

Siddharth Seth commented on HIVE-13970:
---

https://issues.apache.org/jira/browse/HIVE-13915 was created to get rid of the 
duplication of information.

> refactor LLAPIF splits - get rid of SubmitWorkInfo
> --
>
> Key: HIVE-13970
> URL: https://issues.apache.org/jira/browse/HIVE-13970
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13970.01.patch, HIVE-13970.only.patch, 
> HIVE-13970.patch, HIVE-13970.patch
>
>
> First we build the signable vertex spec, convert it into bytes (as we 
> should), and put it inside SubmitWorkInfo. Then we serialize that into byte[] 
> and put it into LlapInputSplit. Then we serialize that to return... We should 
> get rid of one of the steps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13970) refactor LLAPIF splits - get rid of SubmitWorkInfo

2016-06-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335030#comment-15335030
 ] 

Siddharth Seth commented on HIVE-13970:
---

Don't understand the relation - how do protobuf / writables have anything to 
do with this?

I think the protocol used to communicate with the client from 
GenericUDTFGetSplits needs to change a little. The part where it sends back the 
splits should be structured differently. Is that what you're referring to?

> refactor LLAPIF splits - get rid of SubmitWorkInfo
> --
>
> Key: HIVE-13970
> URL: https://issues.apache.org/jira/browse/HIVE-13970
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13970.01.patch, HIVE-13970.only.patch, 
> HIVE-13970.patch, HIVE-13970.patch
>
>
> First we build the signable vertex spec, convert it into bytes (as we 
> should), and put it inside SubmitWorkInfo. Then we serialize that into byte[] 
> and put it into LlapInputSplit. Then we serialize that to return... We should 
> get rid of one of the steps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13970) refactor LLAPIF splits - get rid of SubmitWorkInfo

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335015#comment-15335015
 ] 

Sergey Shelukhin commented on HIVE-13970:
-

Hmm... in that case it should be protobuf, similar to Vertex spec except not 
signed. I'll rework the patch in due course...

> refactor LLAPIF splits - get rid of SubmitWorkInfo
> --
>
> Key: HIVE-13970
> URL: https://issues.apache.org/jira/browse/HIVE-13970
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13970.01.patch, HIVE-13970.only.patch, 
> HIVE-13970.patch, HIVE-13970.patch
>
>
> First we build the signable vertex spec, convert it into bytes (as we 
> should), and put it inside SubmitWorkInfo. Then we serialize that into byte[] 
> and put it into LlapInputSplit. Then we serialize that to return... We should 
> get rid of one of the steps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13966) DbNotificationListener: can lose DDL operation notifications

2016-06-16 Thread Rahul Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Sharma reassigned HIVE-13966:
---

Assignee: Rahul Sharma

> DbNotificationListener: can lose DDL operation notifications
> -
>
> Key: HIVE-13966
> URL: https://issues.apache.org/jira/browse/HIVE-13966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Nachiket Vaidya
>Assignee: Rahul Sharma
>Priority: Critical
>
> The code for each API in HiveMetaStore.java is like this:
> 1. openTransaction()
> 2. -- operation --
> 3. commit() or rollback() based on the result of the operation.
> 4. add an entry to the notification log (unconditionally)
> If the operation fails (in step 2), we still add an entry to the notification 
> log. Found this issue in testing.
> That is still OK, as it is only a false positive.
> If the operation succeeds but adding to the notification log fails, the user 
> will get a MetaException. It will not roll back the operation, as it is 
> already committed. We need to handle this case so that we will not have false 
> negatives.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13970) refactor LLAPIF splits - get rid of SubmitWorkInfo

2016-06-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335011#comment-15335011
 ] 

Siddharth Seth commented on HIVE-13970:
---

[~sershe] - SubmitWorkInfo currently contains information which is common to 
all InputSplits. It's created only once, and should be sent over the network 
only once.
The sending over the network part does not happen at the moment (there's a jira 
open to change that).
I don't think we should make this change, since it'll make it difficult to 
avoid sending the same information along with each and every split. (The size 
of the vertexSpec can get quite large)

> refactor LLAPIF splits - get rid of SubmitWorkInfo
> --
>
> Key: HIVE-13970
> URL: https://issues.apache.org/jira/browse/HIVE-13970
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13970.01.patch, HIVE-13970.only.patch, 
> HIVE-13970.patch, HIVE-13970.patch
>
>
> First we build the signable vertex spec, convert it into bytes (as we 
> should), and put it inside SubmitWorkInfo. Then we serialize that into byte[] 
> and put it into LlapInputSplit. Then we serialize that to return... We should 
> get rid of one of the steps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14018) Make IN clause row selectivity estimation customizable

2016-06-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335000#comment-15335000
 ] 

Hive QA commented on HIVE-14018:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12811059/HIVE-14018.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10233 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_repair
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_table_nonprintable
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testTableCheck
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/142/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/142/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-142/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12811059 - PreCommit-HIVE-MASTER-Build

> Make IN clause row selectivity estimation customizable
> --
>
> Key: HIVE-14018
> URL: https://issues.apache.org/jira/browse/HIVE-14018
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-14018.1.patch, HIVE-14018.patch
>
>
> After HIVE-13287 went in, we calculate IN clause estimates natively (instead 
> of just dividing the incoming number of rows by 2). However, as the 
> distribution of the columns' values is assumed to be uniform, we might end up 
> heavily underestimating/overestimating the resulting number of rows.
> This issue is to add a factor that multiplies the IN clause estimate so we 
> can alleviate this problem. The solution is not very elegant, but it is the 
> best we can do until we have histograms to improve our estimates.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14026) data can not be retrieved

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334988#comment-15334988
 ] 

Sergey Shelukhin commented on HIVE-14026:
-

This particular case seems to be by design for HBase because all rows have the 
same key.
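
For illustration, a variant of the insert from the description that gives each 
source row a distinct HBase row key (a sketch only; row_number over the src key 
is just one hypothetical way to derive distinct keys):
{code}
-- Distinct row keys prevent HBase from collapsing the 500 rows into one
INSERT OVERWRITE TABLE users
SELECT concat('user', cast(row_number() over (order by key) as string)),
       'IA', 'USA', 0
FROM src;
{code}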

> data can not be retrieved
> -
>
> Key: HIVE-14026
> URL: https://issues.apache.org/jira/browse/HIVE-14026
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>
> {code}
> DROP TABLE users;
> CREATE TABLE users(key string, state string, country string, country_id int)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping" = "info:state,info:country,info:country_id"
> );
> INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src;
> select * from users;
> {code}
> The result is only one row:
> {code}
> user1   IA  USA 0
> {code}
> should be 500 rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14003) queries running against llap hang at times - preemption issues

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334987#comment-15334987
 ] 

Sergey Shelukhin commented on HIVE-14003:
-

looks good to me with comments addressed...

> queries running against llap hang at times - preemption issues
> --
>
> Key: HIVE-14003
> URL: https://issues.apache.org/jira/browse/HIVE-14003
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Takahiko Saito
>Assignee: Siddharth Seth
> Attachments: HIVE-14003.01.patch
>
>
> The preemption logic in the Hive processor needs some more work. There are 
> definitely windows where the abort flag is completely dropped within the Hive 
> processor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13833) Add an initial delay when starting the heartbeat

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13833:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> Add an initial delay when starting the heartbeat
> 
>
> Key: HIVE-13833
> URL: https://issues.apache.org/jira/browse/HIVE-13833
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Minor
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13833.1.patch, HIVE-13833.2.patch, 
> HIVE-13833.3.patch, HIVE-13833.4.patch
>
>
> Since the scheduling of the heartbeat happens immediately after lock 
> acquisition, it's unnecessary to send a heartbeat at the moment the locks are 
> acquired. Add an initial delay to skip this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13974) ORC Schema Evolution doesn't support add columns to non-last STRUCT columns

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13974:
---
Fix Version/s: (was: 2.1.0)

> ORC Schema Evolution doesn't support add columns to non-last STRUCT columns
> ---
>
> Key: HIVE-13974
> URL: https://issues.apache.org/jira/browse/HIVE-13974
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, ORC, Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Attachments: HIVE-13974.01.patch, HIVE-13974.02.patch, 
> HIVE-13974.03.patch, HIVE-13974.04.patch
>
>
> Currently, the included columns are based on the fileSchema and not the 
> readerSchema, which doesn't work for adding columns to non-last STRUCT data 
> type columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13961) ACID: Major compaction fails to include the original bucket files if there's no delta directory

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13961:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> ACID: Major compaction fails to include the original bucket files if there's 
> no delta directory
> ---
>
> Key: HIVE-13961
> URL: https://issues.apache.org/jira/browse/HIVE-13961
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Blocker
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13961.1.patch, HIVE-13961.2.patch, 
> HIVE-13961.3.patch, HIVE-13961.4.patch, HIVE-13961.5.patch, HIVE-13961.6.patch
>
>
> The issue can be reproduced by the steps below:
> 1. Insert a row to Non-ACID table
> 2. Convert Non-ACID to ACID table (i.e. set transactional=true table property)
> 3. Perform Major compaction
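
A minimal HiveQL sketch of those three steps (hypothetical table name; bucketed 
ORC is assumed, since ACID tables require it in these versions):
{code}
CREATE TABLE t (a INT) CLUSTERED BY (a) INTO 2 BUCKETS STORED AS ORC;  -- non-ACID
INSERT INTO t VALUES (1);                                              -- original bucket file
ALTER TABLE t SET TBLPROPERTIES ('transactional'='true');              -- convert to ACID
ALTER TABLE t COMPACT 'major';                                         -- major compaction
{code}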



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13957) vectorized IN is inconsistent with non-vectorized (at least for decimal in (string))

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13957:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> vectorized IN is inconsistent with non-vectorized (at least for decimal in 
> (string))
> 
>
> Key: HIVE-13957
> URL: https://issues.apache.org/jira/browse/HIVE-13957
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 1.3.0, 2.1.0, 2.0.2
>
> Attachments: HIVE-13957.01.patch, HIVE-13957.02.patch, 
> HIVE-13957.03.patch, HIVE-13957.patch, HIVE-13957.patch
>
>
> The cast is applied to the column in regular IN, but vectorized IN applies it 
> to the IN() list.
> This can cause queries to produce incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13959) MoveTask should only release its query associated locks

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13959:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> MoveTask should only release its query associated locks
> ---
>
> Key: HIVE-13959
> URL: https://issues.apache.org/jira/browse/HIVE-13959
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.1.0
>
> Attachments: HIVE-13959.1.patch, HIVE-13959.patch, HIVE-13959.patch
>
>
> releaseLocks in MoveTask releases all locks under a HiveLockObject's 
> pathNames. But some of the locks under these pathNames might be for other 
> queries and should not be released.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13984) Use multi-threaded approach to listing files for msck

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13984:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> Use multi-threaded approach to listing files for msck
> -
>
> Key: HIVE-13984
> URL: https://issues.apache.org/jira/browse/HIVE-13984
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13984.01.patch, HIVE-13984.02.patch, 
> HIVE-13984.03.patch, HIVE-13984.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13903) getFunctionInfo is downloading jar on every call

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13903:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> getFunctionInfo is downloading jar on every call
> 
>
> Key: HIVE-13903
> URL: https://issues.apache.org/jira/browse/HIVE-13903
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.1.0
>
> Attachments: HIVE-13903.01.patch, HIVE-13903.01.patch, 
> HIVE-13903.02.patch
>
>
> On queries using permanent UDFs, the jar file of the UDF is downloaded 
> multiple times, with each call originating from Registry.getFunctionInfo. This 
> increases the time for the query, especially if that query is just an explain 
> query. The jar should be downloaded once, and not downloaded again if the UDF 
> class is accessible in the current thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14006) Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14006:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException
> ---
>
> Key: HIVE-14006
> URL: https://issues.apache.org/jira/browse/HIVE-14006
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.1.0
>
> Attachments: HIVE-14006.1.patch, HIVE-14006.patch
>
>
> set hive.cbo.enable=false;
> DROP VIEW IF EXISTS a_view;
> DROP TABLE IF EXISTS table_a1;
> DROP TABLE IF EXISTS table_a2;
> DROP TABLE IF EXISTS table_b1;
> DROP TABLE IF EXISTS table_b2;
> CREATE TABLE table_a1
> (composite_key STRING);
> CREATE TABLE table_a2
> (composite_key STRING);
> CREATE TABLE table_b1
> (composite_key STRING, col1 STRING);
> CREATE TABLE table_b2
> (composite_key STRING);
> CREATE VIEW a_view AS
> SELECT
> substring(a1.composite_key, 1, locate('|',a1.composite_key) - 1) AS autoname,
> NULL AS col1
> FROM table_a1 a1
> FULL OUTER JOIN table_a2 a2
> ON a1.composite_key = a2.composite_key
> UNION ALL
> SELECT
> substring(b1.composite_key, 1, locate('|',b1.composite_key) - 1) AS autoname,
> b1.col1 AS col1
> FROM table_b1 b1
> FULL OUTER JOIN table_b2 b2
> ON b1.composite_key = b2.composite_key;
> INSERT INTO TABLE table_b1
> SELECT * FROM (
> SELECT 'something|awful', 'col1'
> )s ;
> SELECT autoname
> FROM a_view
> WHERE autoname='something';
> fails with 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"_col0":"something"}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"_col0":"something"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   ... 8 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:134)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> The same query succeeds when {{hive.ppd.remove.duplicatefilters=false}} with 
> or without CBO on. It also succeeds with just CBO on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14020) Hive MS restart failed during EU with ORA-00922 error as part of DB schema upgrade

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14020:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> Hive MS restart failed during EU with ORA-00922 error as part of DB schema 
> upgrade
> --
>
> Key: HIVE-14020
> URL: https://issues.apache.org/jira/browse/HIVE-14020
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: HIVE-14020.1.patch
>
>
> NO PRECOMMIT TESTS
> The underlying failure seems to be visible from --verbose:
> {noformat}
> Metastore connection URL:jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> Metastore Connection Driver :oracle.jdbc.driver.OracleDriver
> Metastore connection User:   hiveuser
> Starting upgrade metastore schema from version 2.0.0 to 2.1.0
> Upgrade script upgrade-2.0.0-to-2.1.0.oracle.sql
> Connecting to jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> Connected to: Oracle (version Oracle Database 11g Express Edition Release 
> 11.2.0.2.0 - 64bit Production)
> Driver: Oracle JDBC driver (version 11.2.0.4.0)
> Transaction isolation: TRANSACTION_READ_COMMITTED
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> !autocommit on
> Autocommit status: true
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> SELECT 'Upgrading MetaStore 
> schema from 2.0.0 to 2.1.0' AS Status from dual
> +-------------------------------------------------+
> | STATUS                                          |
> +-------------------------------------------------+
> | Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
> +-------------------------------------------------+
> 1 row selected (0.072 seconds)
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> CREATE TABLE IF NOT EXISTS  
> KEY_CONSTRAINTS ( CHILD_CD_ID NUMBER, CHILD_INTEGER_IDX NUMBER, CHILD_TBL_ID 
> NUMBER, PARENT_CD_ID NUMBER NOT NULL, PARENT_INTEGER_IDX ^M NUMBER NOT NULL, 
> PARENT_TBL_ID NUMBER NOT NULL, POSITION NUMBER NOT NULL, CONSTRAINT_NAME 
> VARCHAR(400) NOT NULL, CONSTRAINT_TYPE NUMBER NOT NULL, UPDATE_RULE NUMBER, 
> DELETE_RULE NUMBER, ENABLE_VALIDATE_REL ^MY NUMBER NOT NULL ) 
> Error: ORA-00922: missing or invalid option (state=42000,code=922)
> Closing: 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
> state would be inconsistent !!
> Underlying cause: java.io.IOException : Schema script failed, errorcode 2
> org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
> state would be inconsistent !!
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:250)
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:218)
> at 
> org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:500)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: Schema script failed, errorcode 2
> at 
> org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:390)
> at 
> org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:347)
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:245)
> ... 8 more
> *** schemaTool failed ***
> {noformat}
> On the face of it, it looks like an issue in the actual script ( 
> 034-HIVE-13076.oracle.sql ) that's provided:
> {noformat}
> CREATE TABLE IF NOT EXISTS  KEY_CONSTRAINTS
> (
>   CHILD_CD_ID NUMBER,
>   CHILD_INTEGER_IDX NUMBER,
>   CHILD_TBL_ID NUMBER,
>   PARENT_CD_ID NUMBER NOT NULL,
>   PARENT_INTEGER_IDX NUMBER NOT NULL,
>   PARENT_TBL_ID NUMBER NOT NULL,
>   POSITION NUMBER NOT NULL,
>   CONSTRAINT_NAME VARCHAR(400) NOT NULL,
>   CONSTRAINT_TYPE NUMBER NOT NULL,
>   UPDATE_RULE NUMBER,
>   DELETE_RULE NUMBER,
>   ENABLE_VALIDATE_RELY NUMBER NOT NULL
> ) ;
> ALTER TABLE KEY_CONSTRAINTS ADD CONSTRAINT CONSTRAINTS_PK PRIMARY KEY 
> (CONSTRAINT_NAME, POSITION);
> CREATE INDEX CONSTRAINTS_PT_INDEX ON KEY_CONSTRAINTS(PARENT_TBL_ID);
> {noformat}
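> Two things stand out in the verbose log above: Oracle 11g does not support a
> CREATE TABLE IF NOT EXISTS clause at all, and the statement carries literal
> ^M carriage returns that split identifiers (ENABLE_VALIDATE_REL ^MY). A
> minimal sketch of the usual Oracle-side guard (the shipped fix may differ):
> {code}
> -- Oracle has no IF NOT EXISTS, so creation is guarded in PL/SQL;
> -- ORA-00955 ("name is already used by an existing object") is
> -- swallowed when the table already exists.
> DECLARE
>   name_in_use EXCEPTION;
>   PRAGMA EXCEPTION_INIT(name_in_use, -955);
> BEGIN
>   EXECUTE IMMEDIATE 'CREATE TABLE KEY_CONSTRAINTS (
>     CHILD_CD_ID NUMBER,
>     CHILD_INTEGER_IDX NUMBER,
>     CHILD_TBL_ID NUMBER,
>     PARENT_CD_ID NUMBER NOT NULL,
>     PARENT_INTEGER_IDX NUMBER NOT NULL,
>     PARENT_TBL_ID NUMBER NOT NULL,
>     POSITION NUMBER NOT NULL,
>     CONSTRAINT_NAME VARCHAR(400) NOT NULL,
>     CONSTRAINT_TYPE NUMBER NOT NULL,
>     UPDATE_RULE NUMBER,
>     DELETE_RULE NUMBER,
>     ENABLE_VALIDATE_RELY NUMBER NOT NULL
>   )';
> EXCEPTION
>   WHEN name_in_use THEN NULL; -- table already exists
> END;
> /
> {code}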



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14008) Duplicate line in LLAP SecretManager

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14008:
---
Fix Version/s: (was: 2.1.1)
   (was: 2.2.0)
   2.1.0

> Duplicate line in LLAP SecretManager
> 
>
> Key: HIVE-14008
> URL: https://issues.apache.org/jira/browse/HIVE-14008
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Fix For: 2.1.0
>
> Attachments: HIVE-14008.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Target Version/s: 1.3.0, 2.2.0  (was: 2.2.0)

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch, 
> HIVE-13985.3.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and give better predictability.
> One other improvement we can make: when hive.orc.splits.include.file.footer 
> is set to false, the task side makes one additional file system call to 
> learn the size of the file. If we serialize the file length in the orc 
> split, this can be avoided.
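> For reference, the trade-off described above hangs off a single setting; a
> minimal sketch (the slimmed-down footer payload is what this jira adds on
> top of it):
> {code}
> -- serialize ORC footers into the splits, so tasks skip the footer read
> -- at the cost of a larger split payload in the AM:
> set hive.orc.splits.include.file.footer=true;
> -- default: splits stay small, but each task re-reads the footer and
> -- makes one extra call for the file length:
> set hive.orc.splits.include.file.footer=false;
> {code}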



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Affects Version/s: 1.3.0

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch, 
> HIVE-13985.3.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and give better predictability.
> One other improvement we can make: when hive.orc.splits.include.file.footer 
> is set to false, the task side makes one additional file system call to 
> learn the size of the file. If we serialize the file length in the orc 
> split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Status: Patch Available  (was: Open)

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch, 
> HIVE-13985.3.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and give better predictability.
> One other improvement we can make: when hive.orc.splits.include.file.footer 
> is set to false, the task side makes one additional file system call to 
> learn the size of the file. If we serialize the file length in the orc 
> split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Status: Open  (was: Patch Available)

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch, 
> HIVE-13985.3.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and give better predictability.
> One other improvement we can make: when hive.orc.splits.include.file.footer 
> is set to false, the task side makes one additional file system call to 
> learn the size of the file. If we serialize the file length in the orc 
> split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Attachment: HIVE-13985.3.patch

Rebased and added more tests for the master branch

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch, 
> HIVE-13985.3.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and give better predictability.
> One other improvement we can make: when hive.orc.splits.include.file.footer 
> is set to false, the task side makes one additional file system call to 
> learn the size of the file. If we serialize the file length in the orc 
> split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14031) cleanup metadataReader in OrcEncodedDataReader

2016-06-16 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334947#comment-15334947
 ] 

Rajesh Balamohan edited comment on HIVE-14031 at 6/16/16 11:24 PM:
---

Thanks [~prasanth_j], [~sershe]. Submitting the patch for the Jenkins job.


was (Author: rajesh.balamohan):
Thanks [~prasanth_j]. Submitting the patch for the Jenkins job.

> cleanup metadataReader in OrcEncodedDataReader
> --
>
> Key: HIVE-14031
> URL: https://issues.apache.org/jira/browse/HIVE-14031
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14031.1.patch
>
>
> MetadataReader should be closed in OrcEncodedDataReader as a part of 
> cleanupReaders. 
> \cc [~gopalv]
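> A minimal sketch of the intended cleanup, assuming a close()-able
> metadataReader field inside OrcEncodedDataReader (names are illustrative,
> not the actual patch):
> {code}
> private void cleanupReaders() {
>   if (metadataReader != null) {
>     try {
>       metadataReader.close();
>     } catch (IOException ex) {
>       // best-effort cleanup; never fail while tearing down readers
>     }
>     metadataReader = null;
>   }
>   // ... close the stripe/data readers and release cache buffers as before
> }
> {code}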



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14031) cleanup metadataReader in OrcEncodedDataReader

2016-06-16 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-14031:

Status: Patch Available  (was: Open)

Thanks [~prasanth_j]. Submitting the patch for the Jenkins job.

> cleanup metadataReader in OrcEncodedDataReader
> --
>
> Key: HIVE-14031
> URL: https://issues.apache.org/jira/browse/HIVE-14031
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14031.1.patch
>
>
> MetadataReader should be closed in OrcEncodedDataReader as a part of 
> cleanupReaders. 
> \cc [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14031) cleanup metadataReader in OrcEncodedDataReader

2016-06-16 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334943#comment-15334943
 ] 

Prasanth Jayachandran commented on HIVE-14031:
--

orcReader is an instance of Reader, which closes the file internally. The 
attached patch looks good to me, +1. Pending tests.

> cleanup metadataReader in OrcEncodedDataReader
> --
>
> Key: HIVE-14031
> URL: https://issues.apache.org/jira/browse/HIVE-14031
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14031.1.patch
>
>
> MetadataReader should be closed in OrcEncodedDataReader as a part of 
> cleanupReaders. 
> \cc [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11089) Hive Streaming: connection fails when using a proxy user UGI

2016-06-16 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng resolved HIVE-11089.
--
Resolution: Invalid

> Hive Streaming: connection fails when using a proxy user UGI
> 
>
> Key: HIVE-11089
> URL: https://issues.apache.org/jira/browse/HIVE-11089
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Adam Kunicki
>Assignee: Wei Zheng
>  Labels: ACID, Streaming
>
> HIVE-7508 "Add Kerberos Support" seems to also remove the ability to specify 
> a proxy user.
> HIVE-8427 adds a call to ugi.hasKerberosCredentials() to check whether the 
> connection is supposed to be a secure connection.
> This however breaks support for proxy users, as a proxy user UGI will always 
> return false from hasKerberosCredentials().
> See lines 273, 274 of HiveEndPoint.java
> {code}
> this.secureMode = ugi==null ? false : ugi.hasKerberosCredentials();
> this.msClient = getMetaStoreClient(endPoint, conf, secureMode);
> {code}
> It also seems that between 0.13.1 and 0.14 the newConnection() method that 
> includes a proxy user has been removed.
> for reference: 
> https://github.com/apache/hive/commit/8e423a12db47759196c24535fbc32236b79f464a
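> A short illustration of why the check misfires, using the stock Hadoop UGI
> API (the "hive" user name is just a placeholder):
> {code}
> import org.apache.hadoop.security.UserGroupInformation;
> 
> public class ProxyUgiDemo {
>   public static void main(String[] args) throws Exception {
>     // the real (login) user may well hold Kerberos credentials...
>     UserGroupInformation real = UserGroupInformation.getLoginUser();
>     // ...but a proxy UGI layered on top of it never does:
>     UserGroupInformation proxy =
>         UserGroupInformation.createProxyUser("hive", real);
>     // prints false even on a kerberized cluster, so HiveEndPoint
>     // wrongly treats the connection as insecure
>     System.out.println(proxy.hasKerberosCredentials());
>   }
> }
> {code}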



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11089) Hive Streaming: connection fails when using a proxy user UGI

2016-06-16 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334922#comment-15334922
 ] 

Wei Zheng commented on HIVE-11089:
--

Had a discussion with [~roshan_naik]. Looks like we never had proxy support for 
Hive streaming from day one. As Roshan pointed out earlier, the "proxyuser" 
param was in a private method and was always null. The plan was to support 
proxying, but that didn't happen (so the wiki was also updated).

If proxying is needed, this will be a new feature request. So I will close this 
ticket as invalid.

> Hive Streaming: connection fails when using a proxy user UGI
> 
>
> Key: HIVE-11089
> URL: https://issues.apache.org/jira/browse/HIVE-11089
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
>Reporter: Adam Kunicki
>Assignee: Wei Zheng
>  Labels: ACID, Streaming
>
> HIVE-7508 "Add Kerberos Support" seems to also remove the ability to specify 
> a proxy user.
> HIVE-8427 adds a call to ugi.hasKerberosCredentials() to check whether the 
> connection is supposed to be a secure connection.
> This however breaks support for proxy users, as a proxy user UGI will always 
> return false from hasKerberosCredentials().
> See lines 273, 274 of HiveEndPoint.java
> {code}
> this.secureMode = ugi==null ? false : ugi.hasKerberosCredentials();
> this.msClient = getMetaStoreClient(endPoint, conf, secureMode);
> {code}
> It also seems that between 0.13.1 and 0.14 the newConnection() method that 
> includes a proxy user has been removed.
> for reference: 
> https://github.com/apache/hive/commit/8e423a12db47759196c24535fbc32236b79f464a



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-06-16 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-13723:
---

Assignee: Vaibhav Gumashta  (was: Ziyang Zhao)

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Vaibhav Gumashta
>Priority: Critical
> Attachments: HIVE-13723.1.patch, HIVE-13723.2.patch
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> this will give the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> 
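> A hedged workaround sketch, assuming the mismatch is only between the int
> and float join keys: make the comparison type explicit so no Float-to-Double
> cast is needed at join time:
> {code}
> select * from test1 join test2
> on cast(test1.a as double) = cast(test2.b as double);
> {code}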

[jira] [Commented] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-06-16 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334851#comment-15334851
 ] 

Vaibhav Gumashta commented on HIVE-13723:
-

Submitted again for a QA run.

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Attachments: HIVE-13723.1.patch, HIVE-13723.2.patch
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> this will give the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-06-16 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13723:

Assignee: Ziyang Zhao  (was: Vaibhav Gumashta)

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Attachments: HIVE-13723.1.patch, HIVE-13723.2.patch
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> this will give the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-06-16 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13723:

Status: Open  (was: Patch Available)

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Ziyang Zhao
>Priority: Critical
> Attachments: HIVE-13723.1.patch, HIVE-13723.2.patch
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> this will give the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
>  

[jira] [Updated] (HIVE-13723) Executing join query on type Float using Thrift Serde will result in Float cast to Double error

2016-06-16 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13723:

Status: Patch Available  (was: Open)

> Executing join query on type Float using Thrift Serde will result in Float 
> cast to Double error
> ---
>
> Key: HIVE-13723
> URL: https://issues.apache.org/jira/browse/HIVE-13723
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, Serializers/Deserializers
>Affects Versions: 2.1.0
>Reporter: Ziyang Zhao
>Assignee: Vaibhav Gumashta
>Priority: Critical
> Attachments: HIVE-13723.1.patch, HIVE-13723.2.patch
>
>
> After enabling the Thrift SerDe, execute the following queries in Beeline:
> >create table test1 (a int);
> >create table test2 (b float);
> >insert into test1 values (1);
> >insert into test2 values (1);
> >select * from test1 join test2 on test1.a=test2.b;
> this will give the error:
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:168) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"b":1.0}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:568) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:159) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-2.7.1.2.4.0.0-169.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>  ~[hadoop-mapreduce-client-common-2.7.1.2.4.0.0-169.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[?:1.7.0_95]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[?:1.7.0_95]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  ~[?:1.7.0_95]
> at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_95]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.ClassCastException: 
> java.lang.Float cannot be cast to java.lang.Double
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) 
> ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)

[jira] [Updated] (HIVE-14039) HiveServer2: Make the usage of server with JDBC thrift serde enabled, backward compatible for older clients

2016-06-16 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14039:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-12427

> HiveServer2: Make the usage of server with JDBC thrift serde enabled, 
> backward compatible for older clients
> ---
>
> Key: HIVE-14039
> URL: https://issues.apache.org/jira/browse/HIVE-14039
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.1
>Reporter: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13788) hive msck listpartitions need to make use of directSQL instead of datanucleus

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13788:
---
Fix Version/s: (was: 2.1.1)

> hive msck listpartitions need to make use of directSQL instead of datanucleus
> -
>
> Key: HIVE-13788
> URL: https://issues.apache.org/jira/browse/HIVE-13788
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-13788.1.patch, HIVE-13788.2.patch, 
> msck_call_stack_with_fix.png, msck_stack_trace.png
>
>
> Currently, for tables having 1000s of partitions, too many DB calls are made 
> via DataNucleus.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13833) Add an initial delay when starting the heartbeat

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13833:
---
Fix Version/s: 1.3.0

> Add an initial delay when starting the heartbeat
> 
>
> Key: HIVE-13833
> URL: https://issues.apache.org/jira/browse/HIVE-13833
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Minor
> Fix For: 1.3.0, 2.2.0, 2.1.1
>
> Attachments: HIVE-13833.1.patch, HIVE-13833.2.patch, 
> HIVE-13833.3.patch, HIVE-13833.4.patch
>
>
> Since the scheduling of the heartbeat happens immediately after lock 
> acquisition, it's unnecessary to send a heartbeat at the moment the locks are 
> acquired. Add an initial delay to skip this first, redundant beat.
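> A minimal sketch of the idea, assuming a plain
> java.util.concurrent.ScheduledExecutorService and a hypothetical
> txnManager.heartbeat() call (the actual patch may wire this differently):
> {code}
> ScheduledExecutorService heartbeatExecutor =
>     Executors.newSingleThreadScheduledExecutor();
> long heartbeatIntervalMs = 150_000L; // placeholder interval
> heartbeatExecutor.scheduleAtFixedRate(
>     () -> txnManager.heartbeat(), // hypothetical heartbeat call
>     heartbeatIntervalMs,          // initial delay skips the redundant
>     heartbeatIntervalMs,          // beat right after lock acquisition
>     TimeUnit.MILLISECONDS);
> {code}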



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13957) vectorized IN is inconsistent with non-vectorized (at least for decimal in (string))

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13957:
---
Fix Version/s: 2.0.2
   1.3.0

> vectorized IN is inconsistent with non-vectorized (at least for decimal in 
> (string))
> 
>
> Key: HIVE-13957
> URL: https://issues.apache.org/jira/browse/HIVE-13957
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 1.3.0, 2.2.0, 2.1.1, 2.0.2
>
> Attachments: HIVE-13957.01.patch, HIVE-13957.02.patch, 
> HIVE-13957.03.patch, HIVE-13957.patch, HIVE-13957.patch
>
>
> The cast is applied to the column in regular IN, but vectorized IN applies it 
> to the IN() list.
> This can cause queries to produce incorrect results.
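> Until the two paths agree, a hedged way to sidestep the ambiguity (table t
> and column dec_col are hypothetical) is to spell the cast out so both
> implementations see the same types:
> {code}
> select * from t
> where dec_col in (cast('1.00' as decimal(10,2)),
>                   cast('2.00' as decimal(10,2)));
> {code}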



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13961) ACID: Major compaction fails to include the original bucket files if there's no delta directory

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13961:
---
Fix Version/s: 1.3.0

> ACID: Major compaction fails to include the original bucket files if there's 
> no delta directory
> ---
>
> Key: HIVE-13961
> URL: https://issues.apache.org/jira/browse/HIVE-13961
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Blocker
> Fix For: 1.3.0, 2.2.0, 2.1.1
>
> Attachments: HIVE-13961.1.patch, HIVE-13961.2.patch, 
> HIVE-13961.3.patch, HIVE-13961.4.patch, HIVE-13961.5.patch, HIVE-13961.6.patch
>
>
> The issue can be reproduced by the steps below:
> 1. Insert a row into a non-ACID table
> 2. Convert it to an ACID table (i.e. set the transactional=true table property)
> 3. Perform a major compaction
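> In HiveQL the three steps look roughly like this (a sketch; the table name
> and bucketing layout are assumptions, since ACID requires a bucketed ORC
> table on these versions):
> {code}
> -- 1. insert a row into a non-ACID table
> create table t (a int, b int)
>   clustered by (a) into 2 buckets
>   stored as orc tblproperties ('transactional'='false');
> insert into t values (1, 2);
> -- 2. convert it to an ACID table
> alter table t set tblproperties ('transactional'='true');
> -- 3. trigger a major compaction; before the fix the original bucket
> --    files were missing from the compacted output
> alter table t compact 'major';
> {code}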



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14008) Duplicate line in LLAP SecretManager

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14008:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> Duplicate line in LLAP SecretManager
> 
>
> Key: HIVE-14008
> URL: https://issues.apache.org/jira/browse/HIVE-14008
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Trivial
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14008.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13788) hive msck listpartitions need to make use of directSQL instead of datanucleus

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13788:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> hive msck listpartitions need to make use of directSQL instead of datanucleus
> -
>
> Key: HIVE-13788
> URL: https://issues.apache.org/jira/browse/HIVE-13788
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Minor
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13788.1.patch, HIVE-13788.2.patch, 
> msck_call_stack_with_fix.png, msck_stack_trace.png
>
>
> Currently, for tables having 1000s of partitions, too many DB calls are made 
> via DataNucleus.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13957) vectorized IN is inconsistent with non-vectorized (at least for decimal in (string))

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13957:
---
Fix Version/s: (was: 2.0.2)
   (was: 2.1.0)
   (was: 1.3.0)
   2.1.1
   2.2.0

> vectorized IN is inconsistent with non-vectorized (at least for decimal in 
> (string))
> 
>
> Key: HIVE-13957
> URL: https://issues.apache.org/jira/browse/HIVE-13957
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13957.01.patch, HIVE-13957.02.patch, 
> HIVE-13957.03.patch, HIVE-13957.patch, HIVE-13957.patch
>
>
> The cast is applied to the column in regular IN, but vectorized IN applies it 
> to the IN() list.
> This can cause queries to produce incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14020) Hive MS restart failed during EU with ORA-00922 error as part of DB schema upgrade

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14020:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> Hive MS restart failed during EU with ORA-00922 error as part of DB schema 
> upgrade
> --
>
> Key: HIVE-14020
> URL: https://issues.apache.org/jira/browse/HIVE-14020
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14020.1.patch
>
>
> NO PRECOMMIT TESTS
> The underlying failure seems to be visible from --verbose : 
> {noformat}
> Metastore connection URL:jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> Metastore Connection Driver :oracle.jdbc.driver.OracleDriver
> Metastore connection User:   hiveuser
> Starting upgrade metastore schema from version 2.0.0 to 2.1.0
> Upgrade script upgrade-2.0.0-to-2.1.0.oracle.sql
> Connecting to jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> Connected to: Oracle (version Oracle Database 11g Express Edition Release 
> 11.2.0.2.0 - 64bit Production)
> Driver: Oracle JDBC driver (version 11.2.0.4.0)
> Transaction isolation: TRANSACTION_READ_COMMITTED
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> !autocommit on
> Autocommit status: true
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> SELECT 'Upgrading MetaStore 
> schema from 2.0.0 to 2.1.0' AS Status from dual
> +-------------------------------------------------+
> | STATUS                                          |
> +-------------------------------------------------+
> | Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
> +-------------------------------------------------+
> 1 row selected (0.072 seconds)
> 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE> CREATE TABLE IF NOT EXISTS  
> KEY_CONSTRAINTS ( CHILD_CD_ID NUMBER, CHILD_INTEGER_IDX NUMBER, CHILD_TBL_ID 
> NUMBER, PARENT_CD_ID NUMBER NOT NULL, PARENT_INTEGER_IDX ^M NUMBER NOT NULL, 
> PARENT_TBL_ID NUMBER NOT NULL, POSITION NUMBER NOT NULL, CONSTRAINT_NAME 
> VARCHAR(400) NOT NULL, CONSTRAINT_TYPE NUMBER NOT NULL, UPDATE_RULE NUMBER, 
> DELETE_RULE NUMBER, ENABLE_VALIDATE_REL ^MY NUMBER NOT NULL ) 
> Error: ORA-00922: missing or invalid option (state=42000,code=922)
> Closing: 0: jdbc:oracle:thin:@//aaa:bb:cc:dd:1521/XE
> org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
> state would be inconsistent !!
> Underlying cause: java.io.IOException : Schema script failed, errorcode 2
> org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
> state would be inconsistent !!
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:250)
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:218)
> at 
> org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:500)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: Schema script failed, errorcode 2
> at 
> org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:390)
> at 
> org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:347)
> at 
> org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:245)
> ... 8 more
> *** schemaTool failed ***
> {noformat}
> On the face of it, it looks like an issue in the actual script ( 
> 034-HIVE-13076.oracle.sql ) that's provided:
> {noformat}
> CREATE TABLE IF NOT EXISTS  KEY_CONSTRAINTS
> (
>   CHILD_CD_ID NUMBER,
>   CHILD_INTEGER_IDX NUMBER,
>   CHILD_TBL_ID NUMBER,
>   PARENT_CD_ID NUMBER NOT NULL,
>   PARENT_INTEGER_IDX NUMBER NOT NULL,
>   PARENT_TBL_ID NUMBER NOT NULL,
>   POSITION NUMBER NOT NULL,
>   CONSTRAINT_NAME VARCHAR(400) NOT NULL,
>   CONSTRAINT_TYPE NUMBER NOT NULL,
>   UPDATE_RULE NUMBER,
>   DELETE_RULE NUMBER,
>   ENABLE_VALIDATE_RELY NUMBER NOT NULL
> ) ;
> ALTER TABLE KEY_CONSTRAINTS ADD CONSTRAINT CONSTRAINTS_PK PRIMARY KEY 
> (CONSTRAINT_NAME, POSITION);
> CREATE INDEX CONSTRAINTS_PT_INDEX ON KEY_CONSTRAINTS(PARENT_TBL_ID);
> {noformat}
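
A minimal sketch of one way to guard against this class of failure, assuming (as the ^M markers in the quoted log suggest) that stray carriage returns in the upgrade script are what trips ORA-00922. The class and method names below are hypothetical, not part of HiveSchemaTool:

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: normalize line endings before a schema script is
// handed to the SQL runner, so that "ENABLE_VALIDATE_REL\rY" executes as
// the intended identifier ENABLE_VALIDATE_RELY.
public class SchemaScriptSanitizer {
  public static String sanitize(String scriptPath) throws IOException {
    String raw = new String(Files.readAllBytes(Paths.get(scriptPath)),
        StandardCharsets.UTF_8);
    // Drop carriage returns; Oracle rejects identifiers split by \r.
    return raw.replace("\r", "");
  }

  public static void main(String[] args) throws IOException {
    System.out.println(sanitize("034-HIVE-13076.oracle.sql"));
  }
}
{code}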



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13959) MoveTask should only release its query associated locks

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13959:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> MoveTask should only release its query associated locks
> ---
>
> Key: HIVE-13959
> URL: https://issues.apache.org/jira/browse/HIVE-13959
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13959.1.patch, HIVE-13959.patch, HIVE-13959.patch
>
>
> releaseLocks in MoveTask releases all locks under a HiveLockObject's 
> pathNames. But some of the locks under these pathNames might belong to other 
> queries and should not be released.
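
A rough illustration of the intended behavior, with hypothetical stand-in types rather than Hive's actual lock classes: release only the locks whose owning query matches, instead of everything registered under the pathNames.

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a lock entry keyed by query id.
class QueryLock {
  final String queryId;
  final String pathName;
  QueryLock(String queryId, String pathName) {
    this.queryId = queryId;
    this.pathName = pathName;
  }
}

public class LockRelease {
  // Locks held by other queries under the same pathName are left untouched;
  // only the current query's locks are candidates for release.
  static List<QueryLock> locksToRelease(List<QueryLock> locksUnderPath,
                                        String currentQueryId) {
    return locksUnderPath.stream()
        .filter(l -> currentQueryId.equals(l.queryId))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<QueryLock> locks = Arrays.asList(
        new QueryLock("query-1", "db/tbl"), new QueryLock("query-2", "db/tbl"));
    System.out.println(locksToRelease(locks, "query-1").size()); // prints 1
  }
}
{code}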



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13984) Use multi-threaded approach to listing files for msck

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13984:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> Use multi-threaded approach to listing files for msck
> -
>
> Key: HIVE-13984
> URL: https://issues.apache.org/jira/browse/HIVE-13984
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13984.01.patch, HIVE-13984.02.patch, 
> HIVE-13984.03.patch, HIVE-13984.04.patch
>
>
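
As a hedged sketch of the technique named in the summary (illustrative only; msck's actual implementation works against Hadoop's FileSystem API, not java.nio): partition directories can be listed in parallel with a fixed thread pool instead of serially.

{code}
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelLister {
  public static List<Path> listAll(List<Path> partitionDirs, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<List<Path>>> futures = new ArrayList<>();
      for (Path dir : partitionDirs) {
        futures.add(pool.submit(() -> {
          try (Stream<Path> s = Files.list(dir)) {  // one listing per task
            return s.collect(Collectors.toList());
          }
        }));
      }
      List<Path> all = new ArrayList<>();
      for (Future<List<Path>> f : futures) {
        all.addAll(f.get());  // propagate any listing failure
      }
      return all;
    } finally {
      pool.shutdown();
    }
  }
}
{code}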




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13903) getFunctionInfo is downloading jar on every call

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13903:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> getFunctionInfo is downloading jar on every call
> 
>
> Key: HIVE-13903
> URL: https://issues.apache.org/jira/browse/HIVE-13903
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13903.01.patch, HIVE-13903.01.patch, 
> HIVE-13903.02.patch
>
>
> On queries using permanent UDFs, the jar file of the UDF is downloaded 
> multiple times, with each call originating from Registry.getFunctionInfo. 
> This increases the time taken by the query, especially if it is just an 
> explain query. The jar should be downloaded once, and not downloaded again 
> if the UDF class is accessible in the current thread.
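
A hedged sketch of the caching idea (not the actual Registry code; downloadAndRegisterJar is a hypothetical helper): fetch the jar only when the UDF class is not already loadable in the current thread's context classloader.

{code}
// Hypothetical sketch: skip the download when the class already resolves.
public class UdfJarCache {
  static Class<?> resolveUdfClass(String className, String jarUri)
      throws ClassNotFoundException {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    try {
      // Already on the classpath: no download needed.
      return Class.forName(className, true, loader);
    } catch (ClassNotFoundException e) {
      downloadAndRegisterJar(jarUri);  // hypothetical helper
      return Class.forName(className, true,
          Thread.currentThread().getContextClassLoader());
    }
  }

  static void downloadAndRegisterJar(String jarUri) {
    // Placeholder: fetch the jar once and add it to the session classloader.
  }
}
{code}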



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13833) Add an initial delay when starting the heartbeat

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13833:
---
Fix Version/s: (was: 2.1.0)
   (was: 1.3.0)
   2.1.1
   2.2.0

> Add an initial delay when starting the heartbeat
> 
>
> Key: HIVE-13833
> URL: https://issues.apache.org/jira/browse/HIVE-13833
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Minor
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13833.1.patch, HIVE-13833.2.patch, 
> HIVE-13833.3.patch, HIVE-13833.4.patch
>
>
> Since the scheduling of the heartbeat happens immediately after lock 
> acquisition, it is unnecessary to send a heartbeat at the moment the locks 
> are acquired. Add an initial delay to skip this first beat.
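
The change can be pictured with plain java.util.concurrent (the class name and interval here are illustrative, not Hive's actual transaction code): passing a non-zero initial delay to the scheduler skips the redundant beat that would otherwise fire at the instant the locks are acquired.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HeartbeatSchedulerSketch {
  public static void main(String[] args) {
    ScheduledExecutorService exec =
        Executors.newSingleThreadScheduledExecutor();
    long intervalMs = 75_000;  // e.g. half of a hypothetical 150s txn timeout
    exec.scheduleAtFixedRate(
        () -> System.out.println("heartbeat"),
        intervalMs,   // initial delay: no beat right after lock acquisition
        intervalMs,   // subsequent period
        TimeUnit.MILLISECONDS);
  }
}
{code}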



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13961) ACID: Major compaction fails to include the original bucket files if there's no delta directory

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13961:
---
Fix Version/s: (was: 2.1.0)
   (was: 1.3.0)
   2.1.1
   2.2.0

> ACID: Major compaction fails to include the original bucket files if there's 
> no delta directory
> ---
>
> Key: HIVE-13961
> URL: https://issues.apache.org/jira/browse/HIVE-13961
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0, 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Blocker
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13961.1.patch, HIVE-13961.2.patch, 
> HIVE-13961.3.patch, HIVE-13961.4.patch, HIVE-13961.5.patch, HIVE-13961.6.patch
>
>
> The issue can be reproduced by the steps below:
> 1. Insert a row to Non-ACID table
> 2. Convert Non-ACID to ACID table (i.e. set transactional=true table property)
> 3. Perform Major compaction



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14006) Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14006:
---
Fix Version/s: (was: 2.1.0)
   2.1.1
   2.2.0

> Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException
> ---
>
> Key: HIVE-14006
> URL: https://issues.apache.org/jira/browse/HIVE-14006
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14006.1.patch, HIVE-14006.patch
>
>
> set hive.cbo.enable=false;
> DROP VIEW IF EXISTS a_view;
> DROP TABLE IF EXISTS table_a1;
> DROP TABLE IF EXISTS table_a2;
> DROP TABLE IF EXISTS table_b1;
> DROP TABLE IF EXISTS table_b2;
> CREATE TABLE table_a1
> (composite_key STRING);
> CREATE TABLE table_a2
> (composite_key STRING);
> CREATE TABLE table_b1
> (composite_key STRING, col1 STRING);
> CREATE TABLE table_b2
> (composite_key STRING);
> CREATE VIEW a_view AS
> SELECT
> substring(a1.composite_key, 1, locate('|',a1.composite_key) - 1) AS autoname,
> NULL AS col1
> FROM table_a1 a1
> FULL OUTER JOIN table_a2 a2
> ON a1.composite_key = a2.composite_key
> UNION ALL
> SELECT
> substring(b1.composite_key, 1, locate('|',b1.composite_key) - 1) AS autoname,
> b1.col1 AS col1
> FROM table_b1 b1
> FULL OUTER JOIN table_b2 b2
> ON b1.composite_key = b2.composite_key;
> INSERT INTO TABLE table_b1
> SELECT * FROM (
> SELECT 'something|awful', 'col1'
> )s ;
> SELECT autoname
> FROM a_view
> WHERE autoname='something';
> fails with 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"_col0":"something"}
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"_col0":"something"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>   ... 8 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>   at 
> org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:134)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> The same query succeeds when {{hive.ppd.remove.duplicatefilters=false}} is 
> set, with or without CBO on. It also succeeds with just CBO on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14022) left semi join throws SemanticException if where clause contains columnname with table alias

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334766#comment-15334766
 ] 

Jesus Camacho Rodriguez commented on HIVE-14022:


Fails are unrelated. [~ashutoshc], could you review it? Thanks

> left semi join throws SemanticException if where clause contains columnname 
> with table alias
> 
>
> Key: HIVE-14022
> URL: https://issues.apache.org/jira/browse/HIVE-14022
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14022.1.patch, HIVE-14022.patch
>
>
> Left semi join throws the following error if the where clause contains a 
> column name with a table alias
> {noformat}
> select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016;
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO ql.Driver: We are setting the hadoop caller 
> context from  to hrt_qa_20160610223737_c3821398-d8df-44d8-9dd5-e66c9b7ed7c7
> 16/06/10 22:37:37 [main]: DEBUG parse.VariableSubstitution: Substitution is 
> on: select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parsing command: select * 
> from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parse Completed
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  start=1465598257393 end=1465598257397 duration=4 
> from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: DEBUG ql.Driver: Encoding valid txns info 
> 9223372036854775807:
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Starting Semantic 
> Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed phase 1 of 
> Semantic Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for source 
> tables
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> subqueries
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> destination tables
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #194
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #194
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #195
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #195
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 1ms
> 16/06/10 22:37:37 [main]: DEBUG hdfs.DFSClient: 
> /tmp/hive/hrt_qa/d2568b75-6399-46df-82b9-34ec445e8f64/hive_2016-06-10_22-37-37_392_2780828105665881901-1:
>  masked=rwx--
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #196
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #196
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: mkdirs took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa 

[jira] [Updated] (HIVE-14022) left semi join throws SemanticException if where clause contains columnname with table alias

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14022:
---
Fix Version/s: (was: 2.2.0)

> left semi join throws SemanticException if where clause contains columnname 
> with table alias
> 
>
> Key: HIVE-14022
> URL: https://issues.apache.org/jira/browse/HIVE-14022
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14022.1.patch, HIVE-14022.patch
>
>
> Left semi join throws the following error if the where clause contains a 
> column name with a table alias
> {noformat}
> select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016;
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO ql.Driver: We are setting the hadoop caller 
> context from  to hrt_qa_20160610223737_c3821398-d8df-44d8-9dd5-e66c9b7ed7c7
> 16/06/10 22:37:37 [main]: DEBUG parse.VariableSubstitution: Substitution is 
> on: select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parsing command: select * 
> from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parse Completed
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  start=1465598257393 end=1465598257397 duration=4 
> from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: DEBUG ql.Driver: Encoding valid txns info 
> 9223372036854775807:
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Starting Semantic 
> Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed phase 1 of 
> Semantic Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for source 
> tables
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> subqueries
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> destination tables
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #194
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #194
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #195
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #195
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 1ms
> 16/06/10 22:37:37 [main]: DEBUG hdfs.DFSClient: 
> /tmp/hive/hrt_qa/d2568b75-6399-46df-82b9-34ec445e8f64/hive_2016-06-10_22-37-37_392_2780828105665881901-1:
>  masked=rwx--
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #196
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #196
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: mkdirs took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #197
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 

[jira] [Updated] (HIVE-14022) left semi join throws SemanticException if where clause contains columnname with table alias

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14022:
---
Affects Version/s: 2.2.0

> left semi join throws SemanticException if where clause contains columnname 
> with table alias
> 
>
> Key: HIVE-14022
> URL: https://issues.apache.org/jira/browse/HIVE-14022
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jagruti Varia
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14022.1.patch, HIVE-14022.patch
>
>
> Left semi join throws the following error if the where clause contains a 
> column name with a table alias
> {noformat}
> select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016;
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO ql.Driver: We are setting the hadoop caller 
> context from  to hrt_qa_20160610223737_c3821398-d8df-44d8-9dd5-e66c9b7ed7c7
> 16/06/10 22:37:37 [main]: DEBUG parse.VariableSubstitution: Substitution is 
> on: select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parsing command: select * 
> from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parse Completed
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  start=1465598257393 end=1465598257397 duration=4 
> from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: DEBUG ql.Driver: Encoding valid txns info 
> 9223372036854775807:
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Starting Semantic 
> Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed phase 1 of 
> Semantic Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for source 
> tables
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> subqueries
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> destination tables
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #194
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #194
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #195
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #195
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath 
> took 1ms
> 16/06/10 22:37:37 [main]: DEBUG hdfs.DFSClient: 
> /tmp/hive/hrt_qa/d2568b75-6399-46df-82b9-34ec445e8f64/hive_2016-06-10_22-37-37_392_2780828105665881901-1:
>  masked=rwx--
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #196
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
> ipc.Client: IPC Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value 
> #196
> 16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: mkdirs took 2ms
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #197
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> 

[jira] [Updated] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14021:
---
Status: Patch Available  (was: In Progress)

> When converting to CNF, fail if the expression exceeds a threshold
> --
>
> Key: HIVE-14021
> URL: https://issues.apache.org/jira/browse/HIVE-14021
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-14021.patch
>
>
> When converting to conjunctive normal form (CNF), fail if the expression 
> exceeds a threshold. CNF can explode exponentially in the size of the input 
> expression, but rarely does so in practice. Add a maxNodeCount parameter to 
> RexUtil.toCnf and throw or return null if it is exceeded.
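
A toy model of the safeguard, using a minimal expression tree rather than Calcite's RexNode (names and structure here are illustrative only): OR-over-AND distribution can blow up exponentially, so the conversion carries a node budget and aborts once it is exhausted.

{code}
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCnf {
  interface Expr {}
  record Var(String name) implements Expr {}
  record And(Expr l, Expr r) implements Expr {}
  record Or(Expr l, Expr r) implements Expr {}

  static class CnfBudgetExceeded extends RuntimeException {}

  static Expr toCnf(Expr e, AtomicInteger budget) {
    if (budget.decrementAndGet() < 0) throw new CnfBudgetExceeded();
    if (e instanceof And a) {
      return new And(toCnf(a.l(), budget), toCnf(a.r(), budget));
    }
    if (e instanceof Or o) {
      Expr l = toCnf(o.l(), budget);
      Expr r = toCnf(o.r(), budget);
      if (l instanceof And la) {  // (x AND y) OR r -> (x OR r) AND (y OR r)
        return new And(toCnf(new Or(la.l(), r), budget),
                       toCnf(new Or(la.r(), r), budget));
      }
      if (r instanceof And ra) {  // l OR (x AND y) -> (l OR x) AND (l OR y)
        return new And(toCnf(new Or(l, ra.l()), budget),
                       toCnf(new Or(l, ra.r()), budget));
      }
      return new Or(l, r);
    }
    return e;  // a leaf variable
  }

  // Mirrors the "throw or return null" behavior described above.
  static Expr toCnfOrNull(Expr e, int maxNodeCount) {
    try {
      return toCnf(e, new AtomicInteger(maxNodeCount));
    } catch (CnfBudgetExceeded ex) {
      return null;
    }
  }
}
{code}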



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14021:
---
Attachment: HIVE-14021.patch

> When converting to CNF, fail if the expression exceeds a threshold
> --
>
> Key: HIVE-14021
> URL: https://issues.apache.org/jira/browse/HIVE-14021
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-14021.patch
>
>
> When converting to conjunctive normal form (CNF), fail if the expression 
> exceeds a threshold. CNF can explode exponentially in the size of the input 
> expression, but rarely does so in practice. Add a maxNodeCount parameter to 
> RexUtil.toCnf and throw or return null if it is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold

2016-06-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-14021 started by Jesus Camacho Rodriguez.
--
> When converting to CNF, fail if the expression exceeds a threshold
> --
>
> Key: HIVE-14021
> URL: https://issues.apache.org/jira/browse/HIVE-14021
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.1.0, 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-14021.patch
>
>
> When converting to conjunctive normal form (CNF), fail if the expression 
> exceeds a threshold. CNF can explode exponentially in the size of the input 
> expression, but rarely does so in practice. Add a maxNodeCount parameter to 
> RexUtil.toCnf and throw or return null if it is exceeded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14022) left semi join throws SemanticException if where clause contains columnname with table alias

2016-06-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334756#comment-15334756
 ] 

Hive QA commented on HIVE-14022:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1282/HIVE-14022.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10234 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_repair
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_table_nonprintable
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testTableCheck
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/141/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/141/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-141/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1282 - PreCommit-HIVE-MASTER-Build

> left semi join throws SemanticException if where clause contains columnname 
> with table alias
> 
>
> Key: HIVE-14022
> URL: https://issues.apache.org/jira/browse/HIVE-14022
> Project: Hive
>  Issue Type: Bug
>Reporter: Jagruti Varia
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.2.0
>
> Attachments: HIVE-14022.1.patch, HIVE-14022.patch
>
>
> Left semi join throws the following error if the where clause contains a 
> column name with a table alias
> {noformat}
> select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016;
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO ql.Driver: We are setting the hadoop caller 
> context from  to hrt_qa_20160610223737_c3821398-d8df-44d8-9dd5-e66c9b7ed7c7
> 16/06/10 22:37:37 [main]: DEBUG parse.VariableSubstitution: Substitution is 
> on: select * from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parsing command: select * 
> from src_emptybucket_partitioned_1 e1 left semi join 
> src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
> e3.year1=2016
> 16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parse Completed
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger:  start=1465598257393 end=1465598257397 duration=4 
> from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: DEBUG ql.Driver: Encoding valid txns info 
> 9223372036854775807:
> 16/06/10 22:37:37 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Starting Semantic 
> Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed phase 1 of 
> Semantic Analysis
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for source 
> tables
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> subqueries
> 16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
> destination tables
> 16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
> Client (147022238) connection to 
> jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #194
> 16/06/10 22:37:37 [IPC Client (147022238) connection to 
> 

[jira] [Commented] (HIVE-14014) zero length file is being created for empty bucket in tez mode (II)

2016-06-16 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334728#comment-15334728
 ] 

Pengcheng Xiong commented on HIVE-14014:


[~ashutoshc], the failures are reproducible. Patch .3 checks outPaths[idx] != 
null, while .2 uses filesCreated. Thanks. 

> zero length file is being created for empty bucket in tez mode (II)
> ---
>
> Key: HIVE-14014
> URL: https://issues.apache.org/jira/browse/HIVE-14014
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14014.01.patch, HIVE-14014.02.patch, 
> HIVE-14014.03.patch
>
>
> The same problem happens when the source table is not empty, e.g., when 
> "limit 0" is not there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-06-16 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334724#comment-15334724
 ] 

Szehon Ho commented on HIVE-13590:
--

+1 on my side.

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13590.1.patch, HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain LDAP 
> case. But it fails if the domain is not in the Hadoop auth_to_local mapping 
> rules; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.<init>(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-06-16 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-13590:
---
Attachment: HIVE-13590.1.patch

Thanks [~szehon] for the review. It is a good catch: I initially tried to use 
the threadLocal authenticationMethod and later found it was not suitable, but 
forgot to remove the get method.
Uploaded a new patch removing this unnecessary method.

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13590.1.patch, HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain LDAP 
> case. But it fails if the domain is not in the Hadoop auth_to_local mapping 
> rules; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.<init>(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14014) zero length file is being created for empty bucket in tez mode (II)

2016-06-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334711#comment-15334711
 ] 

Ashutosh Chauhan commented on HIVE-14014:
-

I don't see any difference between the .2 & .3 patches. If the failures are 
not reproducible, you may check it in without waiting for a QA run.

> zero length file is being created for empty bucket in tez mode (II)
> ---
>
> Key: HIVE-14014
> URL: https://issues.apache.org/jira/browse/HIVE-14014
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14014.01.patch, HIVE-14014.02.patch, 
> HIVE-14014.03.patch
>
>
> The same problem happens when the source table is not empty, e.g., when 
> "limit 0" is not there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334702#comment-15334702
 ] 

Siddharth Seth commented on HIVE-14023:
---

There was a series of bugs around this, and I believe they were resolved.
Any idea on accessing the query string from the conf? Is it stored?

RB: https://reviews.apache.org/r/48818/

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is available right now after the Processor starts, which is too 
> late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Attachment: HIVE-13985-branch-1.patch

With regenerated protobuf code.

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, HIVE-13985.2.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft-reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong-reference cache, which should avoid memory 
> pressure and will have better predictability.
> One other improvement we can make: when 
> hive.orc.splits.include.file.footer is set to false, the task side makes one 
> additional file system call to learn the size of the file. If we serialize 
> the file length in the orc split, this can be avoided.
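
The caching side of the proposal can be sketched with Guava's bounded cache (the key/value types here are illustrative, not the actual ORC classes): a strong-reference cache with a size bound keeps serialized footers resident, whereas soft references can vanish under GC pressure and make repeated runs of the same query unpredictable.

{code}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class FooterCacheSketch {
  // Bounded by entry count instead of relying on soft references.
  private final Cache<String, byte[]> footerCache = CacheBuilder.newBuilder()
      .maximumSize(10_000)
      .build();

  byte[] getOrLoad(String filePath, long fileLength) {
    byte[] footer = footerCache.getIfPresent(filePath);
    if (footer == null) {
      footer = readSerializedFooter(filePath, fileLength);  // hypothetical
      footerCache.put(filePath, footer);
    }
    return footer;
  }

  private byte[] readSerializedFooter(String filePath, long fileLength) {
    // Placeholder: read only stripe info, types and compression details,
    // skipping the column statistics that tasks do not need.
    return new byte[0];
  }
}
{code}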



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7868) AvroSerDe error handling could be improved

2016-06-16 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334696#comment-15334696
 ] 

Anthony Hsu commented on HIVE-7868:
---

I just tested with Hive 1.2.1 and *am* able to fix bad schema literals and 
URLs. I only had issues fixing the bad schema literals and URLs when I 
backported this patch to Hive 0.13.1. It looks like I may be missing some 
other patch that's needed to allow fixing bad schema literals and URLs.

> AvroSerDe error handling could be improved
> --
>
> Key: HIVE-7868
> URL: https://issues.apache.org/jira/browse/HIVE-7868
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Ferdinand Xu
> Fix For: 1.1.0
>
> Attachments: HIVE-7868.1.patch, HIVE-7868.2.patch
>
>
> When an Avro schema is invalid, AvroSerDe returns an error message instead of 
> throwing an exception. This is described in 
> {{AvroSerdeUtils.determineSchemaOrReturnErrorSchema}}:
> {noformat}
>   /**
>* Attempt to determine the schema via the usual means, but do not throw
>* an exception if we fail.  Instead, signal failure via a special
>* schema.  This is used because Hive calls init on the serde during
>* any call, including calls to update the serde properties, meaning
>* if the serde is in a bad state, there is no way to update that state.
>*/
> {noformat}
> I believe we should find a way to provide a better experience to our users.
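
For context, a toy version of the error-signaling pattern the javadoc describes, with hypothetical names rather than the actual AvroSerdeUtils code: the caller receives a sentinel schema instead of an exception, which is what keeps a misconfigured serde updatable but also makes real errors easy to miss.

{code}
public class ErrorSchemaPattern {
  static final String ERROR_SCHEMA =
      "{\"type\":\"record\",\"name\":\"error\",\"fields\":[]}";

  static String determineSchemaOrReturnErrorSchema(String schemaLiteral) {
    try {
      return parseSchema(schemaLiteral);
    } catch (IllegalArgumentException e) {
      // Swallow the failure and hand back a recognizable sentinel.
      return ERROR_SCHEMA;
    }
  }

  static String parseSchema(String literal) {
    if (literal == null || !literal.trim().startsWith("{")) {
      throw new IllegalArgumentException("invalid schema literal");
    }
    return literal;
  }
}
{code}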



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-06-16 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334695#comment-15334695
 ] 

Szehon Ho commented on HIVE-13590:
--

Change looks good to me, but are the changes in HadoopThriftAuthBridge needed? 
For example, "HadoopThriftAuthBridge::getAuthenticationMethod()" doesn't seem 
to be used as far as I can tell.

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain LDAP 
> case. But it fails if the domain is not in the Hadoop auth_to_local mapping 
> rules; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.<init>(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7443) Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs

2016-06-16 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334650#comment-15334650
 ] 

Aihua Xu commented on HIVE-7443:


Attached patch 2. It seems the first patch should work as well. I tested with 
Oracle kinit and then IBM Java, which won't work; both IBM kinit and IBM Java 
do work. I moved the similar logic to hive-shim. 

> Fix HiveConnection to communicate with Kerberized Hive JDBC server and 
> alternative JDKs
> ---
>
> Key: HIVE-7443
> URL: https://issues.apache.org/jira/browse/HIVE-7443
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Security
>Affects Versions: 0.12.0, 0.13.1
> Environment: Kerberos
> Run Hive server2 and client with IBM JDK7.1
>Reporter: Yu Gao
>Assignee: Aihua Xu
> Attachments: HIVE-7443.2.patch, HIVE-7443.patch
>
>
> Hive Kerberos authentication has been enabled in my cluster. I ran kinit to 
> initialize the current login user's ticket cache successfully, and then tried 
> to use beeline to connect to Hive Server2, but failed. After I manually added 
> some logging to catch the failure, this is the exception that caused it:
> beeline>  !connect 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM
>  org.apache.hive.jdbc.HiveDriver
> scan complete in 2ms
> Connecting to 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM
> Enter password for 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM:
> 14/07/17 15:12:45 ERROR jdbc.HiveConnection: Failed to open client transport
> javax.security.sasl.SaslException: Failed to open client transport [Caused by 
> java.io.IOException: Could not instantiate SASL transport]
> at 
> org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:78)
> at 
> org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:342)
> at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:200)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:178)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
> at java.sql.DriverManager.getConnection(DriverManager.java:582)
> at java.sql.DriverManager.getConnection(DriverManager.java:198)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:186)
> at org.apache.hive.beeline.Commands.connect(Commands.java:959)
> at org.apache.hive.beeline.Commands.connect(Commands.java:880)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:44)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:801)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
> at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.IOException: Could not instantiate SASL transport
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Client.createClientTransport(HadoopThriftAuthBridge20S.java:177)
> at 
> org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:74)
> ... 24 more
> Caused by: javax.security.sasl.SaslException: Failure to initialize security 
> context [Caused by org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> minor string: SubjectCredFinder: no JAAS Subject]
> at 
> com.ibm.security.sasl.gsskerb.GssKrb5Client.<init>(GssKrb5Client.java:131)
> at 
> com.ibm.security.sasl.gsskerb.FactoryImpl.createSaslClient(FactoryImpl.java:53)
> at javax.security.sasl.Sasl.createSaslClient(Sasl.java:362)
> at 
> org.apache.thrift.transport.TSaslClientTransport.(TSaslClientTransport.java:72)
> at 
> 

[jira] [Updated] (HIVE-7443) Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs

2016-06-16 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-7443:
---
Attachment: HIVE-7443.2.patch

> Fix HiveConnection to communicate with Kerberized Hive JDBC server and 
> alternative JDKs
> ---
>
> Key: HIVE-7443
> URL: https://issues.apache.org/jira/browse/HIVE-7443
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Security
>Affects Versions: 0.12.0, 0.13.1
> Environment: Kerberos
> Run Hive server2 and client with IBM JDK7.1
>Reporter: Yu Gao
>Assignee: Aihua Xu
> Attachments: HIVE-7443.2.patch, HIVE-7443.patch
>
>
> Hive Kerberos authentication has been enabled in my cluster. I ran kinit to 
> initialize the current login user's ticket cache successfully, and then tried 
> to use beeline to connect to Hive Server2, but failed. After I manually added 
> some logging to catch the failure, this is the exception that caused it:
> beeline>  !connect 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM
>  org.apache.hive.jdbc.HiveDriver
> scan complete in 2ms
> Connecting to 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM
> Enter password for 
> jdbc:hive2://:1/default;principal=hive/@REALM.COM:
> 14/07/17 15:12:45 ERROR jdbc.HiveConnection: Failed to open client transport
> javax.security.sasl.SaslException: Failed to open client transport [Caused by 
> java.io.IOException: Could not instantiate SASL transport]
> at 
> org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:78)
> at 
> org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:342)
> at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:200)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:178)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
> at java.sql.DriverManager.getConnection(DriverManager.java:582)
> at java.sql.DriverManager.getConnection(DriverManager.java:198)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:186)
> at org.apache.hive.beeline.Commands.connect(Commands.java:959)
> at org.apache.hive.beeline.Commands.connect(Commands.java:880)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:44)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:801)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
> at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.io.IOException: Could not instantiate SASL transport
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Client.createClientTransport(HadoopThriftAuthBridge20S.java:177)
> at 
> org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:74)
> ... 24 more
> Caused by: javax.security.sasl.SaslException: Failure to initialize security 
> context [Caused by org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> minor string: SubjectCredFinder: no JAAS Subject]
> at 
> com.ibm.security.sasl.gsskerb.GssKrb5Client.<init>(GssKrb5Client.java:131)
> at 
> com.ibm.security.sasl.gsskerb.FactoryImpl.createSaslClient(FactoryImpl.java:53)
> at javax.security.sasl.Sasl.createSaslClient(Sasl.java:362)
> at 
> org.apache.thrift.transport.TSaslClientTransport.<init>(TSaslClientTransport.java:72)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Client.createClientTransport(HadoopThriftAuthBridge20S.java:169)
> ... 25 more
> Caused by: org.ietf.jgss.GSSException, major code: 13, minor code: 0
> major string: Invalid credentials
> 

[jira] [Commented] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334616#comment-15334616
 ] 

Sergey Shelukhin commented on HIVE-14023:
-

I think I saw a queryID-being-reused bug somewhere... I thought that was 
fixed. [~vikram.dixit] and [~hagleitn] might know better...

Can you post an RB?

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is available right now after the Processor starts, which is too 
> late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14023:
--
Status: Patch Available  (was: Open)

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is available right now after the Processor starts, which is too 
> late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14023:
--
Attachment: HIVE-14023.01.patch

Patch to propagate the queryId to LLAP.

While testing this, I noticed that the same queryId was used across multiple 
DAGs. This happened in a couple of cases while running TestMiniLlapCliDriver. 
I will try again to see what's going on and create a jira; I thought the 
queryId was supposed to be unique.

[~sershe] - please review.
Also, to investigate the queryId not being unique, do you know whether the 
query string is accessible from within the TezProcessor (from the conf)?

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is available right now after the Processor starts, which is too 
> late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-16 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14023:
--
Target Version/s: 2.1.1  (was: 2.1.0)

> LLAP: Make the Hive query id available in ContainerRunner
> -
>
> Key: HIVE-14023
> URL: https://issues.apache.org/jira/browse/HIVE-14023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14023.01.patch
>
>
> Needed to generate logs per query.
> We can use the dag identifier for now, but that isn't very useful. (The 
> queryId may not be too useful either if users cannot find it - but that's 
> better than a dagIdentifier)
> The queryId is currently only available after the Processor starts, which is 
> too late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13985:
-
Attachment: HIVE-13985-branch-1.patch

Set the default initial capacity for the cache to 1024.

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, 
> HIVE-13985.2.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and will have better predictability.
> One other improvement that we can do: when 
> hive.orc.splits.include.file.footer is set to false, on the task side we 
> make one additional file system call to find out the size of the file. If we 
> can serialize the file length in the orc split, this can be avoided.
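
As a rough illustration of the strong-reference cache idea above, a Guava 
cache with a 1024 initial capacity might be set up as below. The key/value 
types and the weight cap are assumptions for the sketch, not the actual patch:

{code:java}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class FooterCacheSketch {
  // Strong-reference cache: entries are evicted by size, not by GC pressure,
  // which keeps hit rates predictable across repeated runs of a query.
  private final Cache<String, byte[]> footerCache = CacheBuilder.newBuilder()
      .initialCapacity(1024)             // default capacity per the comment above
      .maximumWeight(64L * 1024 * 1024)  // assumed cap on total footer bytes
      .weigher((String path, byte[] footer) -> footer.length)
      .build();

  byte[] getFooter(String path) {
    return footerCache.getIfPresent(path);
  }

  void putFooter(String path, byte[] serializedFooter) {
    footerCache.put(path, serializedFooter);
  }
}
{code}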



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334568#comment-15334568
 ] 

Sergey Shelukhin commented on HIVE-14029:
-

Maybe we'll also get a newer Hadoop version in the damn tgz file so we can 
actually upgrade that too!

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump 
> Spark up to 2.0.0 to benefit from those performance improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14031) cleanup metadataReader in OrcEncodedDataReader

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334561#comment-15334561
 ] 

Sergey Shelukhin commented on HIVE-14031:
-

It should be.

> cleanup metadataReader in OrcEncodedDataReader
> --
>
> Key: HIVE-14031
> URL: https://issues.apache.org/jira/browse/HIVE-14031
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14031.1.patch
>
>
> MetadataReader should be closed in OrcEncodedDataReader as a part of 
> cleanupReaders. 
> \cc [~gopalv]
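
The shape of the fix, sketched under the assumption that the reader is held as 
a Closeable field (names are illustrative, not the actual OrcEncodedDataReader 
code):

{code:java}
import java.io.Closeable;
import java.io.IOException;

class ReaderCleanupSketch {
  private Closeable metadataReader; // e.g. the ORC metadata reader

  // Close all readers during cleanup; swallow close() failures so one bad
  // reader doesn't prevent the others from being released.
  void cleanupReaders() {
    if (metadataReader != null) {
      try {
        metadataReader.close();
      } catch (IOException e) {
        // Log and continue; cleanup must not throw.
      }
      metadataReader = null;
    }
  }
}
{code}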



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334572#comment-15334572
 ] 

Sergey Shelukhin commented on HIVE-13985:
-

+1

> ORC improvements for reducing the file system calls in task side
> 
>
> Key: HIVE-13985
> URL: https://issues.apache.org/jira/browse/HIVE-13985
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13985-branch-1.patch, HIVE-13985-branch-1.patch, 
> HIVE-13985-branch-1.patch, HIVE-13985-branch-2.1.patch, HIVE-13985.1.patch, 
> HIVE-13985.2.patch
>
>
> HIVE-13840 fixed some issues with additional file system invocations during 
> split generation. Similarly, this jira will fix issues with additional file 
> system invocations on the task side. To avoid reading footers on the task 
> side, users can set hive.orc.splits.include.file.footer to true, which will 
> serialize the orc footers on the splits. But this has issues with serializing 
> unwanted information like column statistics and other metadata which are not 
> really required for reading an orc split on the task side. We can reduce the 
> payload on the orc splits by serializing only the minimum required 
> information (stripe information, types, compression details). This will 
> decrease the payload on the orc splits and can potentially avoid OOMs in the 
> application master (AM) during split generation. This jira also addresses 
> other issues concerning the AM cache. The local cache used by the AM is a 
> soft reference cache. This can introduce unpredictability across multiple 
> runs of the same query. We can cache the serialized footer in the local 
> cache and also use a strong reference cache, which should avoid memory 
> pressure and will have better predictability.
> One other improvement that we can do: when 
> hive.orc.splits.include.file.footer is set to false, on the task side we 
> make one additional file system call to find out the size of the file. If we 
> can serialize the file length in the orc split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14001) beeline doesn't give out an error when takes either "-e" or "-f" in command instead of both

2016-06-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334549#comment-15334549
 ] 

Hive QA commented on HIVE-14001:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12811091/HIVE-14001.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10235 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/140/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/140/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-140/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12811091 - PreCommit-HIVE-MASTER-Build

> beeline doesn't give out an error when takes either "-e" or "-f" in command 
> instead of both
> ---
>
> Key: HIVE-14001
> URL: https://issues.apache.org/jira/browse/HIVE-14001
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 0.10.0, 2.0.1
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Trivial
> Attachments: HIVE-14001.2.patch, HIVE-14001.patch
>
>
> When providing both arguments, there should be an error message.
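
A minimal sketch of the kind of mutual-exclusion check involved; the method 
and message below are hypothetical, not Beeline's actual parsing code:

{code:java}
public class ExclusiveArgsSketch {
  // Hypothetical validation: -e supplies an inline query, -f a script file.
  // A command line that sets both should be rejected with an error.
  static String validate(String queryString, String scriptFile) {
    if (queryString != null && scriptFile != null) {
      return "Cannot specify both -e and -f; use only one of them";
    }
    return null; // no conflict
  }

  public static void main(String[] args) {
    String err = validate("select 1", "script.sql");
    if (err != null) {
      System.err.println(err); // report instead of silently ignoring a flag
    }
  }
}
{code}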



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

