[jira] [Commented] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086907#comment-16086907
 ] 

Lefty Leverenz commented on HIVE-17086:
---

Doc note:  The new metric *LlapDaemonLimitFileDescriptorCount* and revised 
meaning of *LlapDaemonMaxFileDescriptorCount* need to be documented in the Hive 
Metrics wiki page, along with other LLAP metrics (see HIVE-16072).

* [Hive Metrics | https://cwiki.apache.org/confluence/display/Hive/Hive+Metrics]

Added a TODOC3.0 label.

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17086.1.patch, HIVE-17086.2.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the max file descriptors that the system 
> will allow. For debugging purposes we could also store the max value seen so 
> far, to know whether we have hit the limit. 
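For illustration only (not the attached patch; the class and member names below are hypothetical), a minimal sketch of how a "peak file descriptors seen so far" gauge could be tracked alongside the OS limit using the JDK's UnixOperatingSystemMXBean:

{code:java}
// Sketch: track the peak open-FD count next to the OS-imposed limit.
import java.lang.management.ManagementFactory;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdPeakTracker {
  private long maxSeenOpenFds = 0;

  public synchronized void sample() {
    Object os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof UnixOperatingSystemMXBean) {
      UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
      long open = unixOs.getOpenFileDescriptorCount();  // current usage
      long limit = unixOs.getMaxFileDescriptorCount();  // OS-imposed limit
      maxSeenOpenFds = Math.max(maxSeenOpenFds, open);  // high-water mark
      // A JMX gauge could expose 'limit' (the system allowance) and
      // 'maxSeenOpenFds' (the peak seen so far) as two separate metrics.
    }
  }

  public synchronized long getMaxSeenOpenFds() {
    return maxSeenOpenFds;
  }
}
{code}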



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.

2017-07-13 Thread Bing Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086900#comment-16086900
 ] 

Bing Li commented on HIVE-4577:
---

[~vgumashta] the failures in build#6010 should not be caused by this patch. 
Thanks.

> hive CLI can't handle hadoop dfs command  with space and quotes.
> 
>
> Key: HIVE-4577
> URL: https://issues.apache.org/jira/browse/HIVE-4577
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, 
> HIVE-4577.3.patch.txt, HIVE-4577.4.patch, HIVE-4577.5.patch, HIVE-4577.6.patch
>
>
> As design, hive could support hadoop dfs command in hive shell, like 
> hive> dfs -mkdir /user/biadmin/mydir;
> but has different behavior with hadoop if the path contains space and quotes
> hive> dfs -mkdir "hello"; 
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:40 
> /user/biadmin/"hello"
> hive> dfs -mkdir 'world';
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:43 
> /user/biadmin/'world'
> hive> dfs -mkdir "bei jing";
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/"bei
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/jing"
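For reference, a hypothetical sketch (not one of the attached patches) of quote-aware splitting of the dfs arguments, so a quoted path like "bei jing" stays a single token and the surrounding quotes are stripped before the arguments are handed to the Hadoop shell:

{code:java}
// Sketch: split a "dfs ..." command the way a shell would.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DfsCommandTokenizer {
  // Matches double-quoted, single-quoted, or bare tokens.
  private static final Pattern TOKEN =
      Pattern.compile("\"([^\"]*)\"|'([^']*)'|(\\S+)");

  public static String[] tokenize(String command) {
    List<String> args = new ArrayList<>();
    Matcher m = TOKEN.matcher(command);
    while (m.find()) {
      if (m.group(1) != null) {
        args.add(m.group(1));        // contents of "..."
      } else if (m.group(2) != null) {
        args.add(m.group(2));        // contents of '...'
      } else {
        args.add(m.group(3));        // bare word
      }
    }
    return args.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // Prints [-mkdir, bei jing] rather than [-mkdir, "bei, jing"]
    System.out.println(java.util.Arrays.toString(tokenize("-mkdir \"bei jing\"")));
  }
}
{code}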



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16072) LLAP: Add some additional jvm metrics for hadoop-metrics2

2017-07-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16072:
--
Labels: TODOC2.3  (was: )

> LLAP: Add some additional jvm metrics for hadoop-metrics2 
> --
>
> Key: HIVE-16072
> URL: https://issues.apache.org/jira/browse/HIVE-16072
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: TODOC2.3
> Fix For: 2.2.0
>
> Attachments: HIVE-16072.1.patch, HIVE-16072.2.patch
>
>
> It will be helpful for debugging to expose some metrics like buffer pool, 
> file descriptors etc. that are not exposed via Hadoop's JvmMetrics. We 
> already have a /jmx endpoint that gives out this info, but we don't know the 
> timestamp of allocations or the number of file descriptors to correlate with 
> the logs. This will be better suited for graphing tools. 
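For illustration (not the HIVE-16072 patch), a small sketch of reading the direct/mapped buffer-pool values that plain Hadoop JvmMetrics does not publish, using the standard BufferPoolMXBean; a metrics source would emit these through hadoop-metrics2 instead of printing them:

{code:java}
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class BufferPoolMetricsSample {
  public static void main(String[] args) {
    for (BufferPoolMXBean pool :
        ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      // Typically two pools: "direct" and "mapped".
      System.out.printf("%s: count=%d usedBytes=%d capacityBytes=%d%n",
          pool.getName(), pool.getCount(), pool.getMemoryUsed(),
          pool.getTotalCapacity());
    }
  }
}
{code}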



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17097:

Attachment: HIVE-17097.2.patch

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch, HIVE-17097.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17097:

Attachment: (was: HIVE-17097.2.patch)

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17097:

Attachment: HIVE-17097.2.patch

Thanks [~gopalv], [~djaiswal]. Added {{SemiJoinHint::toString}} in .2 patch.

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch, HIVE-17097.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086881#comment-16086881
 ] 

Deepak Jaiswal commented on HIVE-17097:
---

+1. Thanks for the fix.

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086879#comment-16086879
 ] 

Gopal V commented on HIVE-17097:


LGTM - +1 

The log lines don't print the SemiJoinHint fully; to help debug this sort of 
issue, it would be important to fix SemiJoinHint::toString().
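A hypothetical shape of such a toString (the field names here are assumptions, not copied from Hive's actual SemiJoinHint class):

{code:java}
public class SemiJoinHintExample {
  private final String tabAlias;
  private final String colName;
  private final Integer numEntries;

  public SemiJoinHintExample(String tabAlias, String colName, Integer numEntries) {
    this.tabAlias = tabAlias;
    this.colName = colName;
    this.numEntries = numEntries;
  }

  @Override
  public String toString() {
    // Printing every field makes hint-parsing problems visible in the logs.
    return "SemiJoinHint{table=" + tabAlias + ", col=" + colName
        + ", numEntries=" + numEntries + "}";
  }
}
{code}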

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-17097:
--

Assignee: Rajesh Balamohan

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-17096.

Resolution: Fixed

> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086875#comment-16086875
 ] 

Rajesh Balamohan edited comment on HIVE-17097 at 7/14/17 5:16 AM:
--

When using hints like {{/*+ semi(store_sales, ss_ticket_number, 1696809897)  
*/}},  {{SemanticAnalyzer::parseSingleSemiJoinHint}} ended up parsing 
{{1696809897}} as the target column. Fix is to check "target" for numeric 
values.

\cc [~gopalv], [~djaiswal]


was (Author: rajesh.balamohan):
When using hints like {{/*+ semi(store_sales, ss_ticket_number, 1696809897)  
*/}},  {{SemanticAnalyzer::parseSingleSemiJoinHint}} ended up parsing 
1696809897 as the target column. Fix is to check "target" for numeric values.

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17097:

Status: Patch Available  (was: Open)

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17097) Fix SemiJoinHint parsing in SemanticAnalyzer

2017-07-13 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17097:

Attachment: HIVE-17097.1.patch

When using hints like {{/*+ semi(store_sales, ss_ticket_number, 1696809897)  
*/}},  {{SemanticAnalyzer::parseSingleSemiJoinHint}} ended up parsing 
1696809897 as the target column. Fix is to check "target" for numeric values.
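For illustration, a sketch of the numeric check described above; the class, method and field names are assumptions, not the exact SemanticAnalyzer code:

{code:java}
// Sketch: only a numeric argument should be treated as the entry count,
// so semi(table, column, 1696809897) keeps 'column' as the target column.
import java.util.List;

public class SemiJoinHintParseSketch {
  static final class Hint {
    final String source;      // source table/alias
    final String colName;     // target column, may be null
    final Integer numEntries; // bloom-filter entry count, may be null
    Hint(String source, String colName, Integer numEntries) {
      this.source = source;
      this.colName = colName;
      this.numEntries = numEntries;
    }
  }

  static Hint parseSingleHint(List<String> args) {
    String source = args.get(0);
    String second = args.size() > 1 ? args.get(1) : null;
    String third = args.size() > 2 ? args.get(2) : null;
    if (isNumber(second)) {
      // semi(table, 1000): the second argument is an entry count, not a column.
      return new Hint(source, null, Integer.valueOf(second));
    }
    // semi(table, column, 1000): only a numeric third argument is an entry count.
    Integer entries = isNumber(third) ? Integer.valueOf(third) : null;
    return new Hint(source, second, entries);
  }

  static boolean isNumber(String s) {
    return s != null && s.matches("\\d+");
  }
}
{code}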

> Fix SemiJoinHint parsing in SemanticAnalyzer
> 
>
> Key: HIVE-17097
> URL: https://issues.apache.org/jira/browse/HIVE-17097
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17097.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086872#comment-16086872
 ] 

Pengcheng Xiong commented on HIVE-17096:


Updated the q files.

> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17085) ORC file merge/concatenation should do full schema check

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086859#comment-16086859
 ] 

Hive QA commented on HIVE-17085:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877180/HIVE-17085.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10895 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat_schema]
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat_writer_version]
 (batchId=81)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6023/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6023/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6023/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877180 - PreCommit-HIVE-Build

> ORC file merge/concatenation should do full schema check
> 
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0, 2.3.0, 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17085.1.patch
>
>
> The ORC merging/concatenation compatibility check just looks for a column 
> count match at the outer level. ORC schema evolution now supports inner 
> structs as well, so the outer-level column count may match while the inner 
> columns do not. The compatibility check should do a full schema match before 
> merging/concatenation. This issue will not cause data loss, but it will cause 
> task failures with an exception like the one below
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
> OrcFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>   ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of 
> index entries found: 0 expected: 1
>   at 
> org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>   at 
> org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>   at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>   at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>   ... 19 more
> {code}
> Concatenation should also make sure the writer version matches (it currently 
> checks only the file version).
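For illustration (not the attached patch), a minimal sketch of a full, recursive schema comparison using org.apache.orc.TypeDescription; a complete fix would likely also compare struct field names and the writer version:

{code:java}
import java.util.List;
import org.apache.orc.TypeDescription;

public class OrcSchemaCompatibility {
  public static boolean schemasMatch(TypeDescription a, TypeDescription b) {
    if (a.getCategory() != b.getCategory()) {
      return false;
    }
    List<TypeDescription> aChildren = a.getChildren();
    List<TypeDescription> bChildren = b.getChildren();
    int aSize = aChildren == null ? 0 : aChildren.size();
    int bSize = bChildren == null ? 0 : bChildren.size();
    if (aSize != bSize) {
      return false;
    }
    for (int i = 0; i < aSize; i++) {
      // Recurse into structs, lists, maps and unions so inner columns are
      // checked too (not just the outer column count).
      if (!schemasMatch(aChildren.get(i), bChildren.get(i))) {
        return false;
      }
    }
    return true;
  }
}
{code}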



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17005) Ensure REPL DUMP and REPL LOAD are authorized properly

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086808#comment-16086808
 ] 

Hive QA commented on HIVE-17005:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877175/HIVE-17005.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10893 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6022/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6022/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6022/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877175 - PreCommit-HIVE-Build

> Ensure REPL DUMP and REPL LOAD are authorized properly
> --
>
> Key: HIVE-17005
> URL: https://issues.apache.org/jira/browse/HIVE-17005
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-17005.2.patch, HIVE-17005.patch
>
>
> Currently, we piggyback REPL DUMP and REPL LOAD on EXPORT and IMPORT auth 
> privileges. However, work is underway to not populate all the relevant objects 
> in inputObjs and outputObjs, which then requires that REPL DUMP and REPL LOAD 
> be authorized at a higher level and simply require ADMIN_PRIV to run.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17096) Fix test failures in 2.3 branch

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17096:
--


> Fix test failures in 2.3 branch
> ---
>
> Key: HIVE-17096
> URL: https://issues.apache.org/jira/browse/HIVE-17096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17095) Long chain repl loads do not complete in a timely fashion

2017-07-13 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-17095:

Attachment: HIVE-17095.patch

Patch attached.

> Long chain repl loads do not complete in a timely fashion
> -
>
> Key: HIVE-17095
> URL: https://issues.apache.org/jira/browse/HIVE-17095
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, repl
>Reporter: sapin amin
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-17095.patch
>
>
> Per performance testing done by [~sapinamin] (thus, I'm setting him as 
> reporter), we were able to discover an important bug affecting replication. 
> It has the potential to affect other large DAGs of Tasks that hive generates 
> as well, if those DAGs have multiple paths to child Task nodes.
> Basically, we find that incremental REPL LOAD does not finish in a timely 
> fashion. The test, in this case was to add 400 partitions, and replicate 
> them. Associated with each partition, there was an ADD PTN and an ALTER PTN. 
> For each of the ADD PTN tasks, we'd generate a DDLTask, a CopyTask and a 
> MoveTask. For each Alter ptn, there'd be a single DDLTask. And order of 
> execution is important, so it would chain in dependency collection tasks 
> between phases.
> Trying to root cause this shows us that it seems to stall forever at the 
> Driver instantiation time, and it almost looks like the thread doesn't 
> proceed past that point.
> Looking at logs, it seems that the way this is written, it looks for all 
> tasks generated that are subtrees of all nodes, without looking for 
> duplicates, and this is done simply to get the number of execution tasks!
> And thus, the task visitor will visit every subtree of every node, which is 
> fine if you have graphs that look like open trees, but is horrible for us, 
> since we have dependency collection tasks between each phase. Effectively, 
> this is what's happening:
> We have a DAG, say, like this:
> 4 tasks in parallel -> DEP col -> 4 tasks in parallel -> DEP col -> ...
> This means that for each of the 4 root tasks, we will do a full traversal of 
> every graph (not just every node) past the DEP col, and this happens 
> recursively, and this leads to an exponential growth of number of tasks 
> visited as the length and breadth of the graph increase. In our case, we had 
> about 800 tasks in the graph, with roughly a width of about 2-3, with 200 
> stages, a dep collection before and after, and this meant that leaf nodes of 
> this DAG would have something like 2^200 - 3^200 ways in which they can be 
> visited, and thus, we'd visit them in all those ways. And all this simply to 
> count the number of tasks to schedule - we would revisit this function 
> multiple more times, once per each hook, once for the MapReduceCompiler and 
> once for the TaskCompiler.
> We have not been sending such large DAGs to the Driver, thus it has not yet 
> been a problem, and there are upcoming changes to reduce the number of tasks 
> replication generates (as part of a memory addressing issue), but we still 
> should fix the way we do Task traversal so that a large DAG cannot cripple us.
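As a sketch of the traversal fix being argued for here (illustrative only; the Task interface below is a stand-in, not Hive's actual Task API), counting tasks with a visited set makes the walk linear in nodes and edges instead of exponential through the dependency-collection tasks:

{code:java}
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class TaskCounter {
  interface Task {                       // stand-in for Hive's Task type
    List<Task> getChildTasks();
  }

  public static int countTasks(List<Task> rootTasks) {
    Set<Task> visited = Collections.newSetFromMap(new IdentityHashMap<>());
    Deque<Task> pending = new ArrayDeque<>(rootTasks);
    while (!pending.isEmpty()) {
      Task t = pending.pop();
      if (!visited.add(t)) {
        continue;                        // already counted: skip its subtree
      }
      List<Task> children = t.getChildTasks();
      if (children != null) {
        pending.addAll(children);
      }
    }
    return visited.size();               // each node expanded exactly once
  }
}
{code}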



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17095) Long chain repl loads do not complete in a timely fashion

2017-07-13 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan reassigned HIVE-17095:
---


> Long chain repl loads do not complete in a timely fashion
> -
>
> Key: HIVE-17095
> URL: https://issues.apache.org/jira/browse/HIVE-17095
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, repl
>Reporter: sapin amin
>Assignee: Sushanth Sowmyan
>
> Per performance testing done by [~sapinamin] (thus, I'm setting him as 
> reporter), we were able to discover an important bug affecting replication. 
> It has the potential to affect other large DAGs of Tasks that hive generates 
> as well, if those DAGs have multiple paths to child Task nodes.
> Basically, we find that incremental REPL LOAD does not finish in a timely 
> fashion. The test, in this case was to add 400 partitions, and replicate 
> them. Associated with each partition, there was an ADD PTN and an ALTER PTN. 
> For each of the ADD PTN tasks, we'd generate a DDLTask, a CopyTask and a 
> MoveTask. For each Alter ptn, there'd be a single DDLTask. And order of 
> execution is important, so it would chain in dependency collection tasks 
> between phases.
> Trying to root cause this shows us that it seems to stall forever at the 
> Driver instantiation time, and it almost looks like the thread doesn't 
> proceed past that point.
> Looking at logs, it seems that the way this is written, it looks for all 
> tasks generated that are subtrees of all nodes, without looking for 
> duplicates, and this is done simply to get the number of execution tasks!
> And thus, the task visitor will visit every subtree of every node, which is 
> fine if you have graphs that look like open trees, but is horrible for us, 
> since we have dependency collection tasks between each phase. Effectively, 
> this is what's happening:
> We have a DAG, say, like this:
> 4 tasks in parallel -> DEP col -> 4 tasks in parallel -> DEP col -> ...
> This means that for each of the 4 root tasks, we will do a full traversal of 
> every graph (not just every node) past the DEP col, and this happens 
> recursively, and this leads to an exponential growth of number of tasks 
> visited as the length and breadth of the graph increase. In our case, we had 
> about 800 tasks in the graph, with roughly a width of about 2-3, with 200 
> stages, a dep collection before and after, and this meant that leaf nodes of 
> this DAG would have something like 2^200 - 3^200 ways in which they can be 
> visited, and thus, we'd visit them in all those ways. And all this simply to 
> count the number of tasks to schedule - we would revisit this function 
> multiple more times, once per each hook, once for the MapReduceCompiler and 
> once for the TaskCompiler.
> We have not been sending such large DAGs to the Driver, thus it has not yet 
> been a problem, and there are upcoming changes to reduce the number of tasks 
> replication generates (as part of a memory addressing issue), but we still 
> should fix the way we do Task traversal so that a large DAG cannot cripple us.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086775#comment-16086775
 ] 

ZhangBing Lin commented on HIVE-17094:
--

Submitted a patch!

> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-17094.1.patch
>
>
> A Java enumeration type is a static constant, implicitly modified with static 
> final. The 'static' modifier is therefore redundant for inner enums, so I 
> suggest deleting the 'static' modifier.
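A tiny example of the proposed cleanup:

{code:java}
// A nested enum is implicitly static, so the explicit modifier adds nothing.
public class Outer {
  static enum Color { RED, GREEN }  // 'static' here is redundant
  enum Shape { CIRCLE, SQUARE }     // equivalent without the modifier
}
{code}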



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17094:
-
Status: Patch Available  (was: Open)

> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-17094.1.patch
>
>
> A Java enumeration type is a static constant, implicitly modified with static 
> final. The 'static' modifier is therefore redundant for inner enums, so I 
> suggest deleting the 'static' modifier.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17094:
-
Description: A Java enumeration type is a static constant, implicitly 
modified with static final. The 'static' modifier is therefore redundant for 
inner enums, so I suggest deleting the 'static' modifier.

> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-17094.1.patch
>
>
> A Java enumeration type is a static constant, implicitly modified with static 
> final. The 'static' modifier is therefore redundant for inner enums, so I 
> suggest deleting the 'static' modifier.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17094:
-
Affects Version/s: 3.0.0

> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-17094.1.patch
>
>
> A Java enumeration type is a static constant, implicitly modified with static 
> final. The 'static' modifier is therefore redundant for inner enums, so I 
> suggest deleting the 'static' modifier.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-17094:
-
Attachment: HIVE-17094.1.patch

> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-17094.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17094) Modifier 'static' is redundant for inner enums

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin reassigned HIVE-17094:



> Modifier 'static' is redundant for inner enums
> --
>
> Key: HIVE-17094
> URL: https://issues.apache.org/jira/browse/HIVE-17094
> Project: Hive
>  Issue Type: Improvement
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16366) Hive 2.3 release planning

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086762#comment-16086762
 ] 

Hive QA commented on HIVE-16366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865426/HIVE-16366-branch-2.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10549 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=137)
org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=174)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6021/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6021/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6021/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865426 - PreCommit-HIVE-Build

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-13 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Attachment: HIVE-12631.22.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, 
> HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch, 
> HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch, 
> HIVE-12631.22.patch, HIVE-12631.2.patch, HIVE-12631.3.patch, 
> HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch, 
> HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16824) PrimaryToReplicaResourceFunctionTest.java has missed the ASF header

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-16824:
-
Description: PrimaryToReplicaResourceFunctionTest.java has missed the ASF 
header  (was: PrimaryToReplicaResourceFunctionTest.java lack the ASF Headers)

> PrimaryToReplicaResourceFunctionTest.java has missed the ASF header
> ---
>
> Key: HIVE-16824
> URL: https://issues.apache.org/jira/browse/HIVE-16824
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-16824.1.patch
>
>
> PrimaryToReplicaResourceFunctionTest.java has missed the ASF header



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16824) PrimaryToReplicaResourceFunctionTest.java has missed the ASF header

2017-07-13 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086751#comment-16086751
 ] 

ZhangBing Lin commented on HIVE-16824:
--

[~lirui], can you please take a quick review?

> PrimaryToReplicaResourceFunctionTest.java has missed the ASF header
> ---
>
> Key: HIVE-16824
> URL: https://issues.apache.org/jira/browse/HIVE-16824
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-16824.1.patch
>
>
> PrimaryToReplicaResourceFunctionTest.java lack the ASF Headers



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16824) PrimaryToReplicaResourceFunctionTest.java has missed the ASF header

2017-07-13 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated HIVE-16824:
-
Summary: PrimaryToReplicaResourceFunctionTest.java has missed the ASF 
header  (was: PrimaryToReplicaResourceFunctionTest.java lack the ASF Headers)

> PrimaryToReplicaResourceFunctionTest.java has missed the ASF header
> ---
>
> Key: HIVE-16824
> URL: https://issues.apache.org/jira/browse/HIVE-16824
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: HIVE-16824.1.patch
>
>
> PrimaryToReplicaResourceFunctionTest.java lack the ASF Headers



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17093) LLAP ssl configs need to be localized to talk to a wire encrypted hdfs

2017-07-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086741#comment-16086741
 ] 

Gopal V commented on HIVE-17093:


bq.  I don't think this breaks anything for the LLAP UI wire encryption case 

If I'm not wrong, the ssl-server config is used for the secure shuffle handler.

> LLAP ssl configs need to be localized to talk to a wire encrypted hdfs
> --
>
> Key: HIVE-17093
> URL: https://issues.apache.org/jira/browse/HIVE-17093
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-17093.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17086.1.patch, HIVE-17086.2.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the max file descriptors that the system 
> will allow. For debugging purposes we could also store the max value seen so 
> far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086726#comment-16086726
 ] 

Hive QA commented on HIVE-17086:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877184/HIVE-17086.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10891 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6020/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6020/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6020/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877184 - PreCommit-HIVE-Build

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch, HIVE-17086.2.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the max file descriptors that the system 
> will allow. For debugging purposes we could also store the max value seen so 
> far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17093) LLAP ssl configs need to be localized to talk to a wire encrypted hdfs

2017-07-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-17093:
--
Status: Patch Available  (was: Open)

> LLAP ssl configs need to be localized to talk to a wire encrypted hdfs
> --
>
> Key: HIVE-17093
> URL: https://issues.apache.org/jira/browse/HIVE-17093
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-17093.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17093) LLAP ssl configs need to be localized to talk to a wire encrypted hdfs

2017-07-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-17093:
--
Attachment: HIVE-17093.01.patch

The patch localizes ssl-client.xml instead of ssl-server.xml, which is used by 
NNs/DNs etc.
Also, it stops loading the configs, since they are loaded by the relevant 
sections of DFSClient when required.

[~gopalv] - can you please take a look? I don't think this breaks anything for 
the LLAP UI wire encryption case (already broken?)

> LLAP ssl configs need to be localized to talk to a wire encrypted hdfs
> --
>
> Key: HIVE-17093
> URL: https://issues.apache.org/jira/browse/HIVE-17093
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-17093.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2017-07-13 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-14156 started by Bing Li.
--
> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it will hang.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17087) HoS Query with multiple Partition Pruning Sinks + subquery has incorrect explain

2017-07-13 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086700#comment-16086700
 ] 

Sahil Takiar commented on HIVE-17087:
-

Have a fix ready to go, just waiting on HIVE-17090 before posting the patch.

> HoS Query with multiple Partition Pruning Sinks + subquery has incorrect 
> explain
> 
>
> Key: HIVE-17087
> URL: https://issues.apache.org/jira/browse/HIVE-17087
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Ran the following query in the {{TestSparkCliDriver}}:
> {code:sql}
> set hive.spark.dynamic.partition.pruning=true;
> set hive.auto.convert.join=true;
> create table partitioned_table1 (col int) partitioned by (part_col int);
> create table partitioned_table2 (col int) partitioned by (part_col int);
> create table regular_table (col int);
> insert into table regular_table values (1);
> alter table partitioned_table1 add partition (part_col = 1);
> insert into table partitioned_table1 partition (part_col = 1) values (1), 
> (2), (3), (4), (5), (6), (7), (8), (9), (10);
> alter table partitioned_table2 add partition (part_col = 1);
> insert into table partitioned_table2 partition (part_col = 1) values (1), 
> (2), (3), (4), (5), (6), (7), (8), (9), (10);
> explain select * from partitioned_table1 where partitioned_table1.part_col in 
> (select regular_table.col from regular_table join partitioned_table2 on 
> regular_table.col = partitioned_table2.part_col);
> {code}
> and got the following explain plan:
> {code}
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-4 depends on stages: Stage-2
>   Stage-5 depends on stages: Stage-4
>   Stage-3 depends on stages: Stage-5
>   Stage-1 depends on stages: Stage-3
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 4 
> Map Operator Tree:
> TableScan
>   alias: partitioned_table1
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: col (type: int), part_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   expressions: _col1 (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> keys: _col0 (type: int)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
> Spark Partition Pruning Sink Operator
>   partition key expr: part_col
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   target column name: part_col
>   target work: Map 3
>   Stage: Stage-4
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: regular_table
>   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE 
> Column stats: NONE
>   Filter Operator
> predicate: col is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   expressions: col (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark HashTable Sink Operator
> keys:
>   0 _col0 (type: int)
>   1 _col0 (type: int)
>   Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> 

[jira] [Commented] (HIVE-17091) "Timed out getting readerEvents" error from external LLAP client

2017-07-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086699#comment-16086699
 ] 

Jason Dere commented on HIVE-17091:
---

Looks like a couple different issues at play here:
1) On the LLAP daemon, the executor finished but it's somehow still stuck 
waiting for the ChannelOutputStream to finish all writes (even though all of 
the data was already received by the client). This might be related to the 
pendingWrites/writeMonitor logic being used by the ChannelOutputStream to 
manage the number of outstanding writes for an external fragment request. I've 
tried replacing this mechanism with a Semaphore, and so far I haven't seen 
this issue reoccur.
{noformat}
Thread 1802 (TezTR-683826_93_0_0_29_0):
  State: WAITING
  Blocked count: 456
  Wtaited count: 458
  Waiting on java.lang.Object@7e3b8b1
  Stack:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)

org.apache.hadoop.hive.llap.ChannelOutputStream.waitForWritesToFinish(ChannelOutputStream.java:153)

org.apache.hadoop.hive.llap.ChannelOutputStream.close(ChannelOutputStream.java:136)
java.io.FilterOutputStream.close(FilterOutputStream.java:159)

org.apache.hadoop.hive.llap.io.ChunkedOutputStream.close(ChunkedOutputStream.java:81)
java.io.FilterOutputStream.close(FilterOutputStream.java:159)
org.apache.hadoop.hive.llap.LlapRecordWriter.close(LlapRecordWriter.java:47)

org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.close(HivePassThroughRecordWriter.java:46)

org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:190)

org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:1039)
org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)
org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:711)

org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:464)

org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:206)
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)

org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
{noformat}
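A minimal sketch of the Semaphore-based bounding described in point 1 (the names below are illustrative, not the actual ChannelOutputStream code): each write takes a permit, the completion callback returns it, and close() waits for all permits to come back instead of blocking on a monitor forever.

{code:java}
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class BoundedWriter {
  private final int maxOutstandingWrites;
  private final Semaphore pendingWrites;

  public BoundedWriter(int maxOutstandingWrites) {
    this.maxOutstandingWrites = maxOutstandingWrites;
    this.pendingWrites = new Semaphore(maxOutstandingWrites);
  }

  public void write(byte[] chunk) throws InterruptedException {
    pendingWrites.acquire();                   // blocks once too many writes are in flight
    sendAsync(chunk, pendingWrites::release);  // permit returned when the write completes
  }

  public void close() throws InterruptedException {
    // All permits back means all in-flight writes have completed.
    if (!pendingWrites.tryAcquire(maxOutstandingWrites, 30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("Timed out waiting for outstanding writes");
    }
    pendingWrites.release(maxOutstandingWrites);
  }

  private void sendAsync(byte[] chunk, Runnable onComplete) {
    // Stand-in for the async channel write; the real code would invoke
    // onComplete from the channel future's completion listener.
    onComplete.run();
  }
}
{code}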

2) The LLAP client received the end of the data stream and is expecting a 
heartbeat with a task complete notification:
{noformat}
05:47:44,060 DEBUG 
org.apache.hadoop.hive.llap.ext.LlapTaskUmbilicalExternalClient: Heartbeat from 
attempt_7085310350540683826_0089_0_00_33_0 events: 1
2017-06-29 05:47:44,060 DEBUG 
org.apache.hadoop.hive.llap.ext.LlapTaskUmbilicalExternalClient: Task update 
event for attempt_7085310350540683826_0089_0_00_33_0
2017-06-29 05:47:44,065 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Chunk size 131072
2017-06-29 05:47:44,081 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Chunk size 131072
2017-06-29 05:47:44,097 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Chunk size 131072
2017-06-29 05:47:44,119 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Chunk size 30244
2017-06-29 05:47:44,123 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Chunk size 0
2017-06-29 05:47:44,123 DEBUG 
org.apache.hadoop.hive.llap.io.ChunkedInputStream: 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0):
 Hit end of data
2017-06-29 05:47:44,123 INFO org.apache.hadoop.hive.llap.LlapBaseRecordReader: 
1498729664123 Waiting for reader event for 
LlapTaskUmbilicalExternalClient(attempt_7085310350540683826_0089_0_00_33_0)
{noformat}

Due to issue 1 the task completed event never arrives at the client, though the 
client continues to receive heartbeats from LLAP. Eventually (after 30 
seconds), the external client times out waiting for the task completed event. 
I'm guessing the solution on the client side is that we shouldn't be timing out 
while waiting for the task complete event as long as we are still receiving 
heartbeats. I'll try setting it to an indefinite wait.
{noformat}
2017-06-29 05:48:14,106 DEBUG 
org.apache.hadoop.hive.llap.ext.LlapTaskUmbilicalExternalClient: Received 
heartbeat from container, request={  
containerId=container_7085310350540683826_0089_00_33, requestId=487, 
startIndex=0, 

[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086695#comment-16086695
 ] 

Pengcheng Xiong commented on HIVE-16997:


HLL dense register: // 2^p number of bytes for register, default p=14. that is, 
16384 bytes

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17018) Small table is converted to map join even the total size of small tables exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)

2017-07-13 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084774#comment-16084774
 ] 

liyunzhang_intel edited comment on HIVE-17018 at 7/14/17 1:19 AM:
--

[~csun]: 
{quote}
Yes. I think we don't need to change the existing behavior. I'm just suggesting 
that we might need a HoS specific config to replace 
hive.auto.convert.join.nonconditionaltask.size
{quote}
rename {{hive.auto.convert.join.nonconditionaltask.size}} to 
{{hive.auto.convert.join.within.sparktask.size}}? and the description of the 
configuration 
is changed from
{noformat} 
the sum of size for n-1 of the tables/partitions for a n-way join is smaller 
than it
{noformat}

to 
{noformat}

the sum of size for n-1 of the tables/partitions for a n-way join is smaller 
than it in 1 MapTask or ReduceTask
{noformat}


Can you give some suggestion?



was (Author: kellyzly):
[~csun]: 
{quote}
Yes. I think we don't need to change the existing behavior. I'm just suggesting 
that we might need a HoS specific config to replace 
hive.auto.convert.join.nonconditionaltask.size
{quote}
rename {{hive.auto.convert.join.nonconditionaltask.size}} to 
{{hive.auto.convert.join.within.sparktask.size}}? and the description of the 
configuration 
{noformat} is changed from
the sum of size for n-1 of the tables/partitions for a n-way join is smaller 
than it
{noformat}

to 
{noformat}

the sum of size for n-1 of the tables/partitions for a n-way join is smaller 
than it in 1 MapTask or ReduceTask
{noformat}


Can you give some suggestion?


> Small table is converted to map join even the total size of small tables 
> exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)
> -
>
> Key: HIVE-17018
> URL: https://issues.apache.org/jira/browse/HIVE-17018
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17018_data_init.q, HIVE-17018.q, t3.txt
>
>
>  we use "hive.auto.convert.join.noconditionaltask.size" as the threshold. it 
> means  the sum of size for n-1 of the tables/partitions for a n-way join is 
> smaller than it, it will be converted to a map join. for example, A join B 
> join C join D join E. Big table is A(100M), small tables are 
> B(10M),C(10M),D(10M),E(10M).  If we set 
> hive.auto.convert.join.noconditionaltask.size=20M. In current code, E,D,B 
> will be converted to map join but C will not be converted to map join. In my 
> understanding, because hive.auto.convert.join.noconditionaltask.size can only 
> contain E and D, so C and B should not be converted to map join.  
> Let's explain more why E can be converted to map join.
> in current code, 
> [SparkMapJoinOptimizer#getConnectedMapJoinSize|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L364]
>  calculates all the mapjoins  in the parent path and child path. The search 
> stops when encountering [UnionOperator or 
> ReduceOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L381].
>  Because C is not converted to map join because {{connectedMapJoinSize + 
> totalSize) > maxSize}} [see 
> code|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L330].The
>  RS before the join of C remains. When calculating whether B will be 
> converted to map join, {{getConnectedMapJoinSize}} returns 0 as encountering 
> [RS 
> |https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#409]
>  and causes  {{connectedMapJoinSize + totalSize) < maxSize}} matches.
> [~xuefuz] or [~jxiang]: can you help see whether this is a bug or not  as you 
> are more familiar with SparkJoinOptimizer.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17090) spark.only.query.files are not being run by ptest

2017-07-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17090:
---


> spark.only.query.files are not being run by ptest
> -
>
> Key: HIVE-17090
> URL: https://issues.apache.org/jira/browse/HIVE-17090
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Checked a recent run of Hive QA and it doesn't look like qtests specified in 
> spark.only.query.files are being run.
> I think some modifications to ptest config files are required to get this 
> working - e.g. the deployed master-m2.properties file for ptest should 
> contain mainProperties.${spark.only.query.files} in the 
> qFileTest.miniSparkOnYarn.groups.normal key.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17083) DagUtils overwrites any credentials already added

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086676#comment-16086676
 ] 

Hive QA commented on HIVE-17083:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877164/HIVE-17083.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10892 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_decimal] 
(batchId=9)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_queries] 
(batchId=94)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6019/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6019/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6019/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877164 - PreCommit-HIVE-Build

> DagUtils overwrites any credentials already added
> -
>
> Key: HIVE-17083
> URL: https://issues.apache.org/jira/browse/HIVE-17083
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Josh Elser
>Assignee: Josh Elser
> Attachments: HIVE-17083.patch
>
>
> While working with a StorageHandler with hive.execution.engine=tez, I found 
> that the credentials the storage handler was adding were not propagating to 
> the dag.
> After a bit of debugging/git-log, I found that DagUtils was overwriting the 
> credentials which were already set. A quick patch locally seems to make things 
> work again. Will put together a quick unit test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16997) Extend object store to store bit vectors

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086653#comment-16086653
 ] 

Pengcheng Xiong commented on HIVE-16997:


FM Sketch: the max number of bit vectors is 1024 (1k); each bit vector looks like 
{0,1,2,...,31}, which is 87 characters long. Thus, we need about 87k for the worst 
case of FM Sketch.
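
As a back-of-the-envelope check of that estimate (a standalone sketch with made-up 
names, not the actual FM sketch serialization code):
{code}
public class FmSketchWorstCaseSize {
  public static void main(String[] args) {
    int numBitVectors = 1024;  // maximum number of bit vectors
    // "{0,1,2,...,31}" fully written out: 10 one-digit + 22 two-digit numbers,
    // 31 commas and 2 braces = 10 + 44 + 31 + 2 = 87 characters per vector
    int bytesPerVector = 87;
    System.out.println(numBitVectors * bytesPerVector); // 89088 bytes, roughly 87k
  }
}
{code}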

> Extend object store to store bit vectors
> 
>
> Key: HIVE-16997
> URL: https://issues.apache.org/jira/browse/HIVE-16997
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086647#comment-16086647
 ] 

Pengcheng Xiong commented on HIVE-16907:


[~libing], the following jira may be helpful:
{code}
commit c23841e553cbd4f32d33842d49f9b9e52803d143
Author: Pengcheng Xiong 
Date:   Sun Oct 4 12:45:21 2015 -0700

HIVE-11699: Support special characters in quoted table names (Pengcheng 
Xiong, reviewed by John Pullokkaran)
{code}

>  "INSERT INTO"  overwrite old data when destination table encapsulated by 
> backquote 
> 
>
> Key: HIVE-16907
> URL: https://issues.apache.org/jira/browse/HIVE-16907
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.1.0, 2.1.1
>Reporter: Nemon Lou
>Assignee: Bing Li
> Attachments: HIVE-16907.1.patch
>
>
> A way to reproduce:
> {noformat}
> create database tdb;
> use tdb;
> create table t1(id int);
> create table t2(id int);
> explain insert into `tdb.t1` select * from t2;
> {noformat}
> {noformat}
> +---+
> |  
> Explain  |
> +---+
> | STAGE DEPENDENCIES: 
>   |
> |   Stage-1 is a root stage   
>   |
> |   Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, 
> Stage-4  |
> |   Stage-3   
>   |
> |   Stage-0 depends on stages: Stage-3, Stage-2, Stage-5  
>   |
> |   Stage-2   
>   |
> |   Stage-4   
>   |
> |   Stage-5 depends on stages: Stage-4
>   |
> | 
>   |
> | STAGE PLANS:
>   |
> |   Stage: Stage-1
>   |
> | Map Reduce  
>   |
> |   Map Operator Tree:
>   |
> |   TableScan 
>   |
> | alias: t2   
>   |
> | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE |
> | Select Operator 
>   |
> |   expressions: id (type: int)   
>   |
> |   outputColumnNames: _col0  
>   |
> |   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
> stats: NONE   |
> |   File Output Operator  
>   |
> | 

[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16996:
---
Attachment: HIVE-16966.06.patch

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: Accuracy and performance comparison between HyperLogLog 
> and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, 
> HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch, 
> HIVE-16966.06.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17084) Turn on hive.stats.fetch.column.stats configuration flag

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086625#comment-16086625
 ] 

Hive QA commented on HIVE-17084:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877159/HIVE-17084.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1173 failed/errored test(s), 10891 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=228)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert]
 (batchId=228)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_10] 
(batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] 
(batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_12] 
(batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alias_casted_column] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ambiguous_col] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_date] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ansi_sql_arithmetic] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_6] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_8] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join0] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join10] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join11] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join12] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join13] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join14] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join15] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join16] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join17] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19_inclause] 
(batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join1] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join20] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join21] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join22] (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join23] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join26] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join27] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join28] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join29] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join2] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join31] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join32] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join33] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join3] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join4] (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join5] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join6] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join9] (batchId=72)

[jira] [Commented] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086622#comment-16086622
 ] 

Pengcheng Xiong commented on HIVE-15758:


[~vgarg], could you explain briefly how Hive matches the rows between inner and 
outer queries? Thanks.

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since HIVE 
> doesn't know how to rewrite such queries to preserve the correctness for 
> cases when there are zero rows



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16926) LlapTaskUmbilicalExternalClient should not start new umbilical server for every fragment request

2017-07-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086607#comment-16086607
 ] 

Siddharth Seth commented on HIVE-16926:
---

+1. Looks good.
There are a bunch of unused imports which I forgot to mention. It would be nice to 
remove those before commit.

> LlapTaskUmbilicalExternalClient should not start new umbilical server for 
> every fragment request
> 
>
> Key: HIVE-16926
> URL: https://issues.apache.org/jira/browse/HIVE-16926
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16926.1.patch, HIVE-16926.2.patch, 
> HIVE-16926.3.patch, HIVE-16926.4.patch, HIVE-16926.5.patch
>
>
> Followup task from [~sseth] and [~sershe] after HIVE-16777.
> LlapTaskUmbilicalExternalClient currently creates a new umbilical server for 
> every fragment request, but this is not necessary and the umbilical can be 
> shared.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086603#comment-16086603
 ] 

Aihua Xu commented on HIVE-17088:
-

Thanks [~spena]

+1

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-17088.1.patch
>
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies on HIVE-16049, the HS2 webui stopped working and started throwing an 
> NPE error.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086585#comment-16086585
 ] 

Vineet Garg commented on HIVE-15758:


[~pxiong] It turns out Hive can already plan/execute such queries. I have 
uploaded a patch with the error check disabled and a bunch of new tests.

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since HIVE 
> doesn't know how to rewrite such queries to preserve the correctness for 
> cases when there are zero rows



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15758:
---
Status: Patch Available  (was: Open)

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since HIVE 
> doesn't know how to rewrite such queries to preserve the correctness for 
> cases when there are zero rows



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) run ptest with acid 2.0 the default

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Attachment: HIVE-17089.01.patch

> run ptest with acid 2.0 the default
> ---
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) run ptest with acid 2.0 the default

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Status: Patch Available  (was: Open)

> run ptest with acid 2.0 the default
> ---
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15758) Allow correlated scalar subqueries with aggregates which has non-equi join predicates

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15758:
---
Attachment: HIVE-15758.1.patch

> Allow correlated scalar subqueries with aggregates which has non-equi join 
> predicates
> -
>
> Key: HIVE-15758
> URL: https://issues.apache.org/jira/browse/HIVE-15758
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15758.1.patch
>
>
> Queries such as 
> {code} select * from part where p_size <> (select count(p_size) from part pp 
> where part.p_type <> pp.p_type); {code} are currently not allowed since HIVE 
> doesn't know how to rewrite such queries to preserve the correctness for 
> cases when there are zero rows



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) run ptest with acid 2.0 the default

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Summary: run ptest with acid 2.0 the default  (was: make acid 2.0 the 
default)

> run ptest with acid 2.0 the default
> ---
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15898) add Type2 SCD merge tests

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15898:
--
Attachment: HIVE-15898.11.patch

> add Type2 SCD merge tests
> -
>
> Key: HIVE-15898
> URL: https://issues.apache.org/jira/browse/HIVE-15898
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, 
> HIVE-15898.03.patch, HIVE-15898.04.patch, HIVE-15898.05.patch, 
> HIVE-15898.06.patch, HIVE-15898.07.patch, HIVE-15898.08.patch, 
> HIVE-15898.09.patch, HIVE-15898.10.patch, HIVE-15898.11.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15898) add Type2 SCD merge tests

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086569#comment-16086569
 ] 

Hive QA commented on HIVE-15898:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877148/HIVE-15898.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10892 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_type2_scd]
 (batchId=144)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6017/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6017/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6017/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877148 - PreCommit-HIVE-Build

> add Type2 SCD merge tests
> -
>
> Key: HIVE-15898
> URL: https://issues.apache.org/jira/browse/HIVE-15898
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, 
> HIVE-15898.03.patch, HIVE-15898.04.patch, HIVE-15898.05.patch, 
> HIVE-15898.06.patch, HIVE-15898.07.patch, HIVE-15898.08.patch, 
> HIVE-15898.09.patch, HIVE-15898.10.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: HIVE-17086.2.patch

Renamed the metrics.

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch, HIVE-17086.2.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows max file descriptors that system will 
> allow. For debugging purpose we could also store the max value that was seen 
> so far to know if we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15705) Event replication for constraints

2017-07-13 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086555#comment-16086555
 ] 

Daniel Dai commented on HIVE-15705:
---

[~sankarh], [~anishek], can you give a review?

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch
>
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-12425) OrcRecordUpdater.close(true) leaves the file open

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-12425.
---
   Resolution: Not A Problem
Fix Version/s: 3.0.0

This is fixed in current master (3.0).

> OrcRecordUpdater.close(true) leaves the file open
> -
>
> Key: HIVE-12425
> URL: https://issues.apache.org/jira/browse/HIVE-12425
> Project: Hive
>  Issue Type: Bug
>  Components: ORC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.0.0
>
>
> {noformat}
> public void close(boolean abort) throws IOException {
> if (abort) {
>   if (flushLengths == null) {
> fs.delete(path, false);
>   }
> } else {
>   if (writer != null) writer.close();
> }
> if (flushLengths != null) {
>   flushLengths.close();
>   fs.delete(getSideFile(path), false);
> }
> writer = null;
>   }
> {noformat}
> While the assumption is that the last txn writing to this file to commit 
> would have called flush(), this still leaves the file open.
> cc [~owen.omalley]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17085) ORC file merge/concatenation should do full schema check

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17085:
-
Status: Patch Available  (was: Open)

> ORC file merge/concatenation should do full schema check
> 
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0, 2.3.0, 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17085.1.patch
>
>
> The ORC merging/concatenation compatibility check only looks for a column count 
> match at the outer level. ORC schema evolution now supports inner structs as 
> well, so the outer-level column count may match while the inner columns do 
> not. The compatibility check should do a full schema match before 
> merging/concatenation. This issue will not cause data loss but will cause 
> task failures with an exception like the one below
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
> OrcFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>   ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of 
> index entries found: 0 expected: 1
>   at 
> org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>   at 
> org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>   at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>   at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>   ... 19 more
> {code}
> Concatenation should also make sure the writer version matches (it currently 
> checks only for a file version match).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17085) ORC file merge/concatenation should do full schema check

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17085:
-
Attachment: HIVE-17085.1.patch

[~gopalv] can you please take a look?

> ORC file merge/concatenation should do full schema check
> 
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0, 2.3.0, 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17085.1.patch
>
>
> The ORC merging/concatenation compatibility check only looks for a column count 
> match at the outer level. ORC schema evolution now supports inner structs as 
> well, so the outer-level column count may match while the inner columns do 
> not. The compatibility check should do a full schema match before 
> merging/concatenation. This issue will not cause data loss but will cause 
> task failures with an exception like the one below
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
> OrcFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>   ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of 
> index entries found: 0 expected: 1
>   at 
> org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>   at 
> org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>   at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>   at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>   ... 19 more
> {code}
> Concatenation should also make sure the writer version matches (it currently 
> checks only for a file version match).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086531#comment-16086531
 ] 

Sergio Peña commented on HIVE-17088:


[~aihuaxu] could you help me review this?

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-17088.1.patch
>
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies on HIVE-16049, the HS2 webui stopped working and started throwing an 
> NPE error.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086529#comment-16086529
 ] 

Sergio Peña commented on HIVE-17088:


Here's how to check that the WebUI is working:

Use the TestMiniHS2#testConfInSession method to run a test. First, modify the 
code to enable the WebUI, then add an infinite loop so that the test blocks there:
{noformat}
hiveConf.setBoolVar(ConfVars.HIVE_IN_TEST, false);
miniHS2 = new MiniHS2(hiveConf);
miniHS2.start(new HashMap());

while (true) {}
{noformat}

Then run the test. After a few seconds, go to your browser and type 
"localhost:10002". You should see the WebUI working.

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-17088.1.patch
>
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies on HIVE-16049, the HS2 webui stopped working and started throwing an 
> NPE error.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17089) make acid 2.0 the default

2017-07-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17089:
-


> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-17088:
---
Attachment: HIVE-17088.1.patch

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-17088.1.patch
>
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies on HIVE-16049, the HS2 webui stopped working and started throwing an 
> NPE error.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-17088:
---
Status: Patch Available  (was: Open)

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-17088.1.patch
>
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies on HIVE-16049, the HS2 webui stopped working and started throwing an 
> NPE error.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Open  (was: Patch Available)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17071) Make hive 2.3 depend on storage-api-2.4

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17071:
---
Summary: Make hive 2.3 depend on storage-api-2.4  (was: Make hive 2.3 
depend on storage-api-2.3)

> Make hive 2.3 depend on storage-api-2.4
> ---
>
> Key: HIVE-17071
> URL: https://issues.apache.org/jira/browse/HIVE-17071
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
> Attachments: HIVE-17071-branch-2.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17071) Make hive 2.3 depend on storage-api-2.4

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17071:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Make hive 2.3 depend on storage-api-2.4
> ---
>
> Key: HIVE-17071
> URL: https://issues.apache.org/jira/browse/HIVE-17071
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.3.0
>
> Attachments: HIVE-17071-branch-2.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16366) Hive 2.3 release planning

2017-07-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-16366:
---
Status: Patch Available  (was: Open)

> Hive 2.3 release planning
> -
>
> Key: HIVE-16366
> URL: https://issues.apache.org/jira/browse/HIVE-16366
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
>  Labels: 2.3.0
> Fix For: 2.3.0
>
> Attachments: HIVE-16366-branch-2.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17005) Ensure REPL DUMP and REPL LOAD are authorized properly

2017-07-13 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-17005:

Attachment: HIVE-17005.2.patch

Updated patch to address whitespace issues.

The tests were also failing due to whitespace issues resulting in differing 
.q.out files, so that is fixed as well.

> Ensure REPL DUMP and REPL LOAD are authorized properly
> --
>
> Key: HIVE-17005
> URL: https://issues.apache.org/jira/browse/HIVE-17005
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-17005.2.patch, HIVE-17005.patch
>
>
> Currently, we piggyback REPL DUMP and REPL LOAD on EXPORT and IMPORT auth 
> privileges. However, work is underway to not populate all the relevant objects 
> in inputObjs and outputObjs, which then requires that REPL DUMP and REPL LOAD 
> be authorized at a higher level and simply require ADMIN_PRIV to run.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086487#comment-16086487
 ] 

Gopal V commented on HIVE-17086:


+1 - Recommend renaming to "Limit" and "Max" instead of "Max" and "MaxSoFar".

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 
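For reference, a minimal sketch of how such a high-water mark could be tracked inside the daemon, assuming the JVM's com.sun.management.UnixOperatingSystemMXBean is available; this is an illustration, not the actual HIVE-17086 patch:

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdHighWaterMark {
  private long maxOpenSoFar = 0;

  // Sample the current open-FD count and remember the largest value seen so far.
  public synchronized long sample() {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof UnixOperatingSystemMXBean) {
      UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
      long open = unixOs.getOpenFileDescriptorCount();  // currently open FDs
      long limit = unixOs.getMaxFileDescriptorCount();  // OS-imposed limit
      maxOpenSoFar = Math.max(maxOpenSoFar, open);      // high-water mark to export as a metric
      System.out.printf("open=%d maxSoFar=%d limit=%d%n", open, maxOpenSoFar, limit);
    }
    return maxOpenSoFar;
  }
}
{code}

Reporting the high-water mark next to the limit makes it obvious at a glance whether the daemon has ever come close to exhausting its file descriptors.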



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17088) HS2 WebUI throws a NullPointerException when opened

2017-07-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-17088:
--

Assignee: Sergio Peña

> HS2 WebUI throws a NullPointerException when opened
> ---
>
> Key: HIVE-17088
> URL: https://issues.apache.org/jira/browse/HIVE-17088
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>
> After bumping the Jetty version to 9.3 and excluding several other 
> dependencies in HIVE-16049, the HS2 webui stopped working and started 
> throwing an NPE.
> {noformat}
> HTTP ERROR 500
> Problem accessing /hiveserver2.jsp. Reason:
> Server Error
> Caused by:
> java.lang.NullPointerException
>   at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:181)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>   at org.eclipse.jetty.server.Server.handle(Server.java:534)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>   at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:240)
>   at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>   at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>   at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>   at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>   at java.lang.Thread.run(Thread.java:748)
> Powered by Jetty:// 9.3.19.v20170502
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17072) Make the parallelized timeout configurable in BeeLine tests

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086478#comment-16086478
 ] 

Hive QA commented on HIVE-17072:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877124/HIVE-17072.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10891 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalXY 
(batchId=180)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6016/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6016/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6016/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877124 - PreCommit-HIVE-Build

> Make the parallelized timeout configurable in BeeLine tests
> ---
>
> Key: HIVE-17072
> URL: https://issues.apache.org/jira/browse/HIVE-17072
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Minor
> Attachments: HIVE-17072.1.patch, HIVE-17072.2.patch
>
>
> When running the BeeLine tests in parallel, the timeout is hardcoded in 
> Parallelized.java:
> {noformat}
> @Override
> public void finished() {
>   executor.shutdown();
>   try {
> executor.awaitTermination(10, TimeUnit.MINUTES);
>   } catch (InterruptedException exc) {
> throw new RuntimeException(exc);
>   }
> }
> {noformat}
> It would be better to make it configurable.
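A minimal sketch of one way to do that, reading the timeout from a system property; the property name here is an assumption, not necessarily what the patch uses, and the surrounding Parallelized.java context (the executor field and imports) is as shown in the snippet above:

{code}
@Override
public void finished() {
  executor.shutdown();
  // Read the timeout from a system property, falling back to the old default of 10 minutes.
  long timeoutMinutes = Long.parseLong(
      System.getProperty("test.beeline.parallel.timeout.minutes", "10"));
  try {
    executor.awaitTermination(timeoutMinutes, TimeUnit.MINUTES);
  } catch (InterruptedException exc) {
    throw new RuntimeException(exc);
  }
}
{code}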



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14988) Support INSERT OVERWRITE into a partition on transactional tables

2017-07-13 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086448#comment-16086448
 ] 

Eugene Koifman commented on HIVE-14988:
---

+1 LGTM

FYI, it may be easier to check for file names like this using the 
INPUT__FILE__NAME virtual column; see TestTxnCommands.testNonAcidToAcidConversion01() 
for an example (and the sketch after this comment).

It may be good to move the 2 new tests in TestTxnCommands2 to TestTxnCommands.  
The former has 2 subclasses which rerun all the tests in Split Update mode and 
vectorized mode.  I don't think there is value in that for these tests - just 
adds to runtime.  

This can be done later.
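To illustrate the suggestion, a hedged sketch of querying the virtual column over JDBC; the connection URL and table name are placeholders, and an actual TestTxnCommands test would use its own driver helpers instead of JDBC:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class InputFileNameCheck {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement();
         // INPUT__FILE__NAME reports the file each row was read from, so a test can
         // assert that rows landed in the expected base_/delta_ directories without
         // listing the warehouse file system directly.
         ResultSet rs = stmt.executeQuery(
             "select INPUT__FILE__NAME, a, b from acid_tbl order by a, b")) {
      while (rs.next()) {
        System.out.println(rs.getString(1)); // e.g. .../acid_tbl/base_0000005/bucket_00000
      }
    }
  }
}
{code}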



> Support INSERT OVERWRITE into a partition on transactional tables
> -
>
> Key: HIVE-14988
> URL: https://issues.apache.org/jira/browse/HIVE-14988
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-14988.01.patch, HIVE-14988.02.patch, 
> HIVE-14988.03.patch, HIVE-14988.04.patch, HIVE-14988.05.patch, 
> HIVE-14988.06.patch
>
>
> An insert overwrite operation on a transactional table will currently raise an 
> error.
> This can and should be supported.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17085) ORC file merge/concatenation should do full schema check

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17085:
-
Description: 
The ORC merging/concatenation compatibility check only looks for a column count match 
at the outer level. ORC schema evolution now supports inner structs as well, so the 
outer-level column count may match while the inner columns do not. The compatibility 
check should do a full schema match before merging/concatenation. This issue will not 
cause data loss, but it will cause task failures with an exception like the one below
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
OrcFileMergeOperator
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
at 
org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
... 16 more
Caused by: java.lang.IllegalArgumentException: Column has wrong number of index 
entries found: 0 expected: 1
at 
org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
at 
org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
... 19 more
{code}

Concatenation should also make sure the writer version matches (it currently 
checks only that the file versions match).

  was:
ORC merging/concatenation compatibility check just looks for column count match 
at outer level. ORC schema evolution now supports inner structs as well. With 
that outer level column count will match but inner column level will not match. 
Compatibility check should do full schema match before merging/concatenation. 
This issue will not cause data loss but will cause task failures with exception 
like below
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
OrcFileMergeOperator
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
at 
org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
... 16 more
Caused by: java.lang.IllegalArgumentException: Column has wrong number of index 
entries found: 0 expected: 1
at 
org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
at 
org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
... 19 more
{code}


> ORC file merge/concatenation should do full schema check
> 
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0, 2.3.0, 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> The ORC merging/concatenation compatibility check only looks for a column count 
> match at the outer level. ORC schema evolution now supports inner structs as 
> well, so the outer-level column count may match while the inner columns do not. 
> The compatibility check should do a full schema match before merging/concatenation. 
> This issue will not cause data loss, but it will cause task failures with an 
> exception like the one below
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
> OrcFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> 

[jira] [Assigned] (HIVE-17087) HoS Query with multiple Partition Pruning Sinks + subquery has incorrect explain

2017-07-13 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17087:
---


> HoS Query with multiple Partition Pruning Sinks + subquery has incorrect 
> explain
> 
>
> Key: HIVE-17087
> URL: https://issues.apache.org/jira/browse/HIVE-17087
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Ran the following query in the {{TestSparkCliDriver}}:
> {code:sql}
> set hive.spark.dynamic.partition.pruning=true;
> set hive.auto.convert.join=true;
> create table partitioned_table1 (col int) partitioned by (part_col int);
> create table partitioned_table2 (col int) partitioned by (part_col int);
> create table regular_table (col int);
> insert into table regular_table values (1);
> alter table partitioned_table1 add partition (part_col = 1);
> insert into table partitioned_table1 partition (part_col = 1) values (1), 
> (2), (3), (4), (5), (6), (7), (8), (9), (10);
> alter table partitioned_table2 add partition (part_col = 1);
> insert into table partitioned_table2 partition (part_col = 1) values (1), 
> (2), (3), (4), (5), (6), (7), (8), (9), (10);
> explain select * from partitioned_table1 where partitioned_table1.part_col in 
> (select regular_table.col from regular_table join partitioned_table2 on 
> regular_table.col = partitioned_table2.part_col);
> {code}
> and got the following explain plan:
> {code}
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-4 depends on stages: Stage-2
>   Stage-5 depends on stages: Stage-4
>   Stage-3 depends on stages: Stage-5
>   Stage-1 depends on stages: Stage-3
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 4 
> Map Operator Tree:
> TableScan
>   alias: partitioned_table1
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: col (type: int), part_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   expressions: _col1 (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> keys: _col0 (type: int)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
> Spark Partition Pruning Sink Operator
>   partition key expr: part_col
>   Statistics: Num rows: 10 Data size: 11 Basic stats: 
> COMPLETE Column stats: NONE
>   target column name: part_col
>   target work: Map 3
>   Stage: Stage-4
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: regular_table
>   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE 
> Column stats: NONE
>   Filter Operator
> predicate: col is not null (type: boolean)
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   expressions: col (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark HashTable Sink Operator
> keys:
>   0 _col0 (type: int)
>   1 _col0 (type: int)
>   Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Spark Partition Pruning Sink Operator
> partition key expr: part_col
>

[jira] [Commented] (HIVE-16973) Fetching of Delegation tokens (Kerberos) for AccumuloStorageHandler fails in HS2

2017-07-13 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086439#comment-16086439
 ] 

Josh Elser commented on HIVE-16973:
---

After HIVE-17083, my local testing setup works (again) as expected. We should 
land HIVE-17083 and then this one.

> Fetching of Delegation tokens (Kerberos) for AccumuloStorageHandler fails in 
> HS2
> 
>
> Key: HIVE-16973
> URL: https://issues.apache.org/jira/browse/HIVE-16973
> Project: Hive
>  Issue Type: Bug
>  Components: Accumulo Storage Handler
>Reporter: Josh Elser
>Assignee: Josh Elser
> Attachments: HIVE-16973.001.patch, HIVE-16973.002.branch-2.patch, 
> HIVE-16973.003.branch-2.patch, HIVE-16973.004-branch-2.patch, 
> HIVE-16973.004.branch-2.patch, HIVE-16973.004-master.patch
>
>
> Had a report from a user that Kerberos+AccumuloStorageHandler+HS2 was broken. 
> Looking into it, it seems like the bit-rot got pretty bad. You'll see 
> something like the following:
> {noformat}
> Caused by: java.io.IOException: Failed to unwrap AuthenticationToken 
> at 
> org.apache.hadoop.hive.accumulo.HiveAccumuloHelper.unwrapAuthenticationToken(HiveAccumuloHelper.java:312)
>  
> at 
> org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableInputFormat.getSplits(HiveAccumuloTableInputFormat.java:122)
>  
> {noformat}
> It appears that some of the code-paths changed since I first did my 
> testing (or I just did poor testing) and the delegation token was never being 
> fetched/serialized. There are also some issues with fetching the delegation 
> token from Accumulo properly, which were addressed in ACCUMULO-4665.
> I believe it would also be best to just update the dependency to use Accumulo 
> 1.7 (drop 1.6 support) as it's lacking in this regard. These changes would 
> otherwise get much more complicated with reflection -- Accumulo has moved on 
> past 1.6, so let's do the same in Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17083) DagUtils overwrites any credentials already added

2017-07-13 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HIVE-17083:
--
Status: Patch Available  (was: Open)

> DagUtils overwrites any credentials already added
> -
>
> Key: HIVE-17083
> URL: https://issues.apache.org/jira/browse/HIVE-17083
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Josh Elser
>Assignee: Josh Elser
> Attachments: HIVE-17083.patch
>
>
> While working with a StorageHandler with hive.execution.engine=tez, I found 
> that the credentials the storage handler was adding were not propagating to 
> the dag.
> After a bit of debugging/git-log, I found that DagUtils was overwriting the 
> credentials which were already set. A quick patch locally seems to make things 
> work again. Will put together a quick unit test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17083) DagUtils overwrites any credentials already added

2017-07-13 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HIVE-17083:
--
Attachment: HIVE-17083.patch

Seems like this little bug was the cause of my heartburn. Trivial fix and added 
a unit test for the future.

> DagUtils overwrites any credentials already added
> -
>
> Key: HIVE-17083
> URL: https://issues.apache.org/jira/browse/HIVE-17083
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Josh Elser
>Assignee: Josh Elser
> Attachments: HIVE-17083.patch
>
>
> While working with a StorageHandler with hive.execution.engine=tez, I found 
> that the credentials the storage handler was adding were not propagating to 
> the dag.
> After a bit of debugging/git-log, I found that DagUtils was overwriting the 
> credentials which were already set. A quick patch locally seems to make things 
> work again. Will put together a quick unit test.
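A hedged illustration of the failure mode and the safer pattern; this is not the actual DagUtils code, the Credentials and DAG calls are from the public Hadoop/Tez APIs, and the helper itself is hypothetical:

{code}
import org.apache.hadoop.security.Credentials;
import org.apache.tez.dag.api.DAG;

public final class CredentialMergeSketch {
  private CredentialMergeSketch() {}

  // 'alreadyCollected' stands for tokens a StorageHandler has already contributed.
  static void setDagCredentials(DAG dag, Credentials alreadyCollected, Credentials sessionCreds) {
    // Problematic pattern: calling dag.setCredentials(sessionCreds) alone would drop
    // everything the storage handler already added.

    // Safer pattern: merge the session credentials into the ones already collected,
    // then hand the combined set to the DAG.
    Credentials merged = new Credentials(alreadyCollected);
    merged.mergeAll(sessionCreds);
    dag.setCredentials(merged);
  }
}
{code}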



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086405#comment-16086405
 ] 

Prasanth Jayachandran commented on HIVE-17086:
--

[~gopalv] can you please take a look? small patch

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: HIVE-17086.1.patch

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: (was: HIVE-17086.1.patch)

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: (was: HIVE-17086.1.patch)

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: HIVE-17086.1.patch

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Attachment: HIVE-17086.1.patch

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17086:
-
Status: Patch Available  (was: Open)

> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17086.1.patch
>
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16973) Fetching of Delegation tokens (Kerberos) for AccumuloStorageHandler fails in HS2

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086402#comment-16086402
 ] 

Hive QA commented on HIVE-16973:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877126/HIVE-16973.004-master.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6015/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6015/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6015/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-hdfs/2.8.0/hadoop-hdfs-2.8.0.jar(org/apache/hadoop/hdfs/web/AuthFilter.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/shims/common/target/hive-shims-common-3.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/shims/Utils.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.8.0/hadoop-common-2.8.0.jar(org/apache/hadoop/security/UserGroupInformation.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-auth/2.8.0/hadoop-auth-2.8.0.jar(org/apache/hadoop/security/authentication/client/PseudoAuthenticator.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-auth/2.8.0/hadoop-auth-2.8.0.jar(org/apache/hadoop/security/authentication/server/PseudoAuthenticationHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.8.0/hadoop-common-2.8.0.jar(org/apache/hadoop/util/GenericOptionsParser.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-rewrite/9.3.8.v20160314/jetty-rewrite-9.3.8.v20160314.jar(org/eclipse/jetty/rewrite/handler/RedirectPatternRule.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-rewrite/9.3.8.v20160314/jetty-rewrite-9.3.8.v20160314.jar(org/eclipse/jetty/rewrite/handler/RewriteHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/Handler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/Server.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/ServerConnector.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-server/9.3.8.v20160314/jetty-server-9.3.8.v20160314.jar(org/eclipse/jetty/server/handler/HandlerList.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/FilterHolder.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/ServletContextHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-servlet/9.3.8.v20160314/jetty-servlet-9.3.8.v20160314.jar(org/eclipse/jetty/servlet/ServletHolder.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/eclipse/jetty/jetty-xml/9.3.8.v20160314/jetty-xml-9.3.8.v20160314.jar(org/eclipse/jetty/xml/XmlConfiguration.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/slf4j/jul-to-slf4j/1.7.10/jul-to-slf4j-1.7.10.jar(org/slf4j/bridge/SLF4JBridgeHandler.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/javax/servlet/javax.servlet-api/3.1.0/javax.servlet-api-3.1.0.jar(javax/servlet/DispatcherType.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar(javax/servlet/http/HttpServletRequest.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-3.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceAudience$LimitedPrivate.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-3.0.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/classification/InterfaceStability$Unstable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/ByteArrayOutputStream.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/io/OutputStream.class)]]
[loading 

[jira] [Assigned] (HIVE-17086) LLAP: JMX Metric for max file descriptors used so far

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17086:



> LLAP: JMX Metric for max file descriptors used so far
> -
>
> Key: HIVE-17086
> URL: https://issues.apache.org/jira/browse/HIVE-17086
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> LlapDaemonMaxFileDescriptorCount shows the maximum number of file descriptors 
> that the system will allow. For debugging purposes we could also store the 
> maximum value seen so far, to know whether we have hit the limit. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17085) ORC file merge/concatenation should do full schema check

2017-07-13 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17085:



> ORC file merge/concatenation should do full schema check
> 
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0, 2.3.0, 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> The ORC merging/concatenation compatibility check only looks for a column count 
> match at the outer level. ORC schema evolution now supports inner structs as 
> well, so the outer-level column count may match while the inner columns do not. 
> The compatibility check should do a full schema match before merging/concatenation. 
> This issue will not cause data loss, but it will cause task failures with an 
> exception like the one below
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close 
> OrcFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>   ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of 
> index entries found: 0 expected: 1
>   at 
> org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>   at 
> org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>   at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>   at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>   at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>   ... 19 more
> {code}
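For context, a minimal sketch of what a full-schema (rather than outer-column-count) compatibility check could look like, using the public ORC reader API; the helper name is an assumption, not the actual patch:

{code}
import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;

public final class OrcMergeCompatCheck {
  private OrcMergeCompatCheck() {}

  static boolean compatibleForMerge(Reader first, Reader candidate) {
    TypeDescription a = first.getSchema();
    TypeDescription b = candidate.getSchema();
    // TypeDescription.toString() renders the complete nested type tree, so comparing
    // the rendered schemas is a full structural match, not just an outer column count.
    boolean sameSchema = a.toString().equals(b.toString());
    // Also require the same writer version, not only the same file version.
    boolean sameWriterVersion = first.getWriterVersion().equals(candidate.getWriterVersion());
    return sameSchema && sameWriterVersion;
  }
}
{code}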



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables

2017-07-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086392#comment-16086392
 ] 

Hive QA commented on HIVE-12631:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12877122/HIVE-12631.21.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10894 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] 
(batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=151)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning 
(batchId=289)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6014/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6014/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6014/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12877122 - PreCommit-HIVE-Build

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, 
> HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch, 
> HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch, 
> HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, 
> HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, 
> HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember, the ACID logic is embedded inside the ORC format; we need to 
> refactor it to sit on top of some interface, if practical, or just port it to 
> the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache the merged representation in the future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17084) Turn on hive.stats.fetch.column.stats configuration flag

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17084:
---
Status: Patch Available  (was: Open)

> Turn on hive.stats.fetch.column.stats configuration flag
> 
>
> Key: HIVE-17084
> URL: https://issues.apache.org/jira/browse/HIVE-17084
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17084.1.patch
>
>
> This flag is off by default and could result in bad plans due to missing 
> column statistics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17084) Turn on hive.stats.fetch.column.stats configuration flag

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17084:
---
Attachment: HIVE-17084.1.patch

> Turn on hive.stats.fetch.column.stats configuration flag
> 
>
> Key: HIVE-17084
> URL: https://issues.apache.org/jira/browse/HIVE-17084
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17084.1.patch
>
>
> This flag is off by default and could result in bad plans due to missing 
> column statistics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17084) Turn on hive.stats.fetch.column.stats configuration flag

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17084:
--


> Turn on hive.stats.fetch.column.stats configuration flag
> 
>
> Key: HIVE-17084
> URL: https://issues.apache.org/jira/browse/HIVE-17084
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
>
> This flag is off by default and could result in bad plans due to missing 
> column statistics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant

2017-07-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086379#comment-16086379
 ] 

Vineet Garg commented on HIVE-16793:


Documented this new configuration under [Configuration Wiki | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.remove.sq_count_check]
 cc [~leftylev]

> Scalar sub-query: sq_count_check not required if gby keys are constant
> --
>
> Key: HIVE-16793
> URL: https://issues.apache.org/jira/browse/HIVE-16793
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, 
> HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch
>
>
> This query has an sq_count_check, though it is useless on a constant key.
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part 
> where p_type = '1' group by p_type);
> Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product
> Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Map 5 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 vectorized, llap
>   File Output Operator [FS_64]
> Select Operator [SEL_63] (rows= width=621)
>   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   Filter Operator [FIL_62] (rows= width=625)
> predicate:(_col5 > _col10)
> Map Join Operator [MAPJOIN_61] (rows=2 width=625)
>   
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"]
> <-Reducer 6 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_58]
> Select Operator [SEL_57] (rows=1 width=4)
>   Output:["_col0"]
>   Group By Operator [GBY_56] (rows=1 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
>   <-Map 5 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_55]
>   PartitionCols:_col0
>   Group By Operator [GBY_54] (rows=86 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1'
> Select Operator [SEL_53] (rows=1212121 width=109)
>   Output:["_col1"]
>   Filter Operator [FIL_52] (rows=1212121 width=109)
> predicate:(p_type = '1')
> TableScan [TS_17] (rows=2 width=109)
>   
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
> <-Map Join Operator [MAPJOIN_60] (rows=2 width=621)
> 
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_51]
>   Select Operator [SEL_50] (rows=1 width=8)
> Filter Operator [FIL_49] (rows=1 width=8)
>   predicate:(sq_count_check(_col0) <= 1)
>   Group By Operator [GBY_48] (rows=1 width=8)
> Output:["_col0"],aggregations:["count(VALUE._col0)"]
>   <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
> PARTITION_ONLY_SHUFFLE [RS_47]
>   Group By Operator [GBY_46] (rows=1 width=8)
> Output:["_col0"],aggregations:["count()"]
> Select Operator [SEL_45] (rows=1 width=85)
>   Group By Operator [GBY_44] (rows=1 width=85)
> Output:["_col0"],keys:KEY._col0
>   <-Map 2 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_43]
>   PartitionCols:_col0
>   Group By Operator [GBY_42] (rows=83 
> width=85)
> Output:["_col0"],keys:'1'
> Select Operator [SEL_41] (rows=1212121 
> width=105)
>   Filter Operator [FIL_40] (rows=1212121 
> width=105)
>   

[jira] [Commented] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant

2017-07-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086361#comment-16086361
 ] 

Vineet Garg commented on HIVE-16793:


Pushed to master. Thanks Ashutosh!

> Scalar sub-query: sq_count_check not required if gby keys are constant
> --
>
> Key: HIVE-16793
> URL: https://issues.apache.org/jira/browse/HIVE-16793
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, 
> HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch
>
>
> This query has an sq_count_check, though it is useless on a constant key.
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part 
> where p_type = '1' group by p_type);
> Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product
> Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Map 5 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 vectorized, llap
>   File Output Operator [FS_64]
> Select Operator [SEL_63] (rows= width=621)
>   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   Filter Operator [FIL_62] (rows= width=625)
> predicate:(_col5 > _col10)
> Map Join Operator [MAPJOIN_61] (rows=2 width=625)
>   
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"]
> <-Reducer 6 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_58]
> Select Operator [SEL_57] (rows=1 width=4)
>   Output:["_col0"]
>   Group By Operator [GBY_56] (rows=1 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
>   <-Map 5 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_55]
>   PartitionCols:_col0
>   Group By Operator [GBY_54] (rows=86 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1'
> Select Operator [SEL_53] (rows=1212121 width=109)
>   Output:["_col1"]
>   Filter Operator [FIL_52] (rows=1212121 width=109)
> predicate:(p_type = '1')
> TableScan [TS_17] (rows=2 width=109)
>   
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
> <-Map Join Operator [MAPJOIN_60] (rows=2 width=621)
> 
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_51]
>   Select Operator [SEL_50] (rows=1 width=8)
> Filter Operator [FIL_49] (rows=1 width=8)
>   predicate:(sq_count_check(_col0) <= 1)
>   Group By Operator [GBY_48] (rows=1 width=8)
> Output:["_col0"],aggregations:["count(VALUE._col0)"]
>   <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
> PARTITION_ONLY_SHUFFLE [RS_47]
>   Group By Operator [GBY_46] (rows=1 width=8)
> Output:["_col0"],aggregations:["count()"]
> Select Operator [SEL_45] (rows=1 width=85)
>   Group By Operator [GBY_44] (rows=1 width=85)
> Output:["_col0"],keys:KEY._col0
>   <-Map 2 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_43]
>   PartitionCols:_col0
>   Group By Operator [GBY_42] (rows=83 
> width=85)
> Output:["_col0"],keys:'1'
> Select Operator [SEL_41] (rows=1212121 
> width=105)
>   Filter Operator [FIL_40] (rows=1212121 
> width=105)
> predicate:(p_type = '1')
> TableScan [TS_2] (rows=2 
> width=105)
>   
> 

[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant

2017-07-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16793:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Scalar sub-query: sq_count_check not required if gby keys are constant
> --
>
> Key: HIVE-16793
> URL: https://issues.apache.org/jira/browse/HIVE-16793
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, 
> HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch
>
>
> This query has an sq_count_check, though it is useless on a constant key.
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part 
> where p_type = '1' group by p_type);
> Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product
> Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Map 5 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Map 1 vectorized, llap
>   File Output Operator [FS_64]
> Select Operator [SEL_63] (rows= width=621)
>   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   Filter Operator [FIL_62] (rows= width=625)
> predicate:(_col5 > _col10)
> Map Join Operator [MAPJOIN_61] (rows=2 width=625)
>   
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"]
> <-Reducer 6 [BROADCAST_EDGE] vectorized, llap
>   BROADCAST [RS_58]
> Select Operator [SEL_57] (rows=1 width=4)
>   Output:["_col0"]
>   Group By Operator [GBY_56] (rows=1 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
>   <-Map 5 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_55]
>   PartitionCols:_col0
>   Group By Operator [GBY_54] (rows=86 width=89)
> 
> Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1'
> Select Operator [SEL_53] (rows=1212121 width=109)
>   Output:["_col1"]
>   Filter Operator [FIL_52] (rows=1212121 width=109)
> predicate:(p_type = '1')
> TableScan [TS_17] (rows=2 width=109)
>   
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
> <-Map Join Operator [MAPJOIN_60] (rows=2 width=621)
> 
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>   <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_51]
>   Select Operator [SEL_50] (rows=1 width=8)
> Filter Operator [FIL_49] (rows=1 width=8)
>   predicate:(sq_count_check(_col0) <= 1)
>   Group By Operator [GBY_48] (rows=1 width=8)
> Output:["_col0"],aggregations:["count(VALUE._col0)"]
>   <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
> PARTITION_ONLY_SHUFFLE [RS_47]
>   Group By Operator [GBY_46] (rows=1 width=8)
> Output:["_col0"],aggregations:["count()"]
> Select Operator [SEL_45] (rows=1 width=85)
>   Group By Operator [GBY_44] (rows=1 width=85)
> Output:["_col0"],keys:KEY._col0
>   <-Map 2 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_43]
>   PartitionCols:_col0
>   Group By Operator [GBY_42] (rows=83 
> width=85)
> Output:["_col0"],keys:'1'
> Select Operator [SEL_41] (rows=1212121 
> width=105)
>   Filter Operator [FIL_40] (rows=1212121 
> width=105)
> predicate:(p_type = '1')
> TableScan [TS_2] (rows=2 
> width=105)
>
