[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451321#comment-15451321
 ] 

Lefty Leverenz commented on HIVE-14362:
---

Doc note:  EXPLAIN ANALYZE needs to be documented in the wiki for release 2.2.0.

* [LanguageManual -- Explain|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain]

Added a TODOC2.2 label.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, HIVE-14362.05.patch, HIVE-14362.06.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.
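
To make the estimated/real gap in the description concrete, here is a minimal
Java sketch. It is not taken from the patch: the class name, the operator ids,
the row counts and the 10x threshold are all hypothetical, chosen only to
illustrate flagging operators whose actual row count is far from the estimate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StatsGapSketch {
    /** Ratio of actual to estimated rows; the estimate is clamped to 1 to avoid division by zero. */
    public static double gapRatio(long estimatedRows, long actualRows) {
        return (double) actualRows / Math.max(1L, estimatedRows);
    }

    /** True when the estimate is off by more than the given factor in either direction. */
    public static boolean isMajorGap(long estimatedRows, long actualRows, double factor) {
        double ratio = gapRatio(estimatedRows, actualRows);
        return ratio > factor || ratio < 1.0 / factor;
    }

    public static void main(String[] args) {
        // Hypothetical operator ids mapped to {estimated, actual} row counts.
        Map<String, long[]> ops = new LinkedHashMap<>();
        ops.put("TS_0", new long[] {500, 500});
        ops.put("FIL_1", new long[] {250, 12});
        ops.put("RS_8", new long[] {250, 12});
        for (Map.Entry<String, long[]> e : ops.entrySet()) {
            long est = e.getValue()[0];
            long act = e.getValue()[1];
            System.out.printf("%s estimated=%d actual=%d majorGap=%b%n",
                e.getKey(), est, act, isMajorGap(est, act, 10.0));
        }
    }
}
```

Comparing a ratio in both directions catches underestimates as well as
overestimates, which is the kind of gap the description is asking to surface.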



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-30 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450103#comment-15450103
 ] 

Pengcheng Xiong commented on HIVE-14362:


Pushed to master. Thanks [~ashutoshc], [~gopalv] and [~gszadovszky] for the 
reviews! [~gopalv], I will open another JIRA to support abort stats.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, HIVE-14362.05.patch, HIVE-14362.06.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450022#comment-15450022
 ] 

Hive QA commented on HIVE-14362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12826218/HIVE-14362.06.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10472 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1044/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1044/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1044/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12826218 - PreCommit-HIVE-MASTER-Build

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, HIVE-14362.05.patch, HIVE-14362.06.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15447468#comment-15447468
 ] 

Ashutosh Chauhan commented on HIVE-14362:
-

+1

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, HIVE-14362.05.patch, compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441398#comment-15441398
 ] 

Hive QA commented on HIVE-14362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825703/HIVE-14362.05.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10470 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_0]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1019/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1019/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1019/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825703 - PreCommit-HIVE-MASTER-Build

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, HIVE-14362.05.patch, compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435150#comment-15435150
 ] 

Pengcheng Xiong commented on HIVE-14362:


Thanks [~gopalv] for the detailed performance analysis. I have addressed the 
local file and vectorization issues. I still have a few other small issues to 
address before I submit another patch. Thanks.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-23 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15433811#comment-15433811
 ] 

Gopal V commented on HIVE-14362:


bq. I assume that your major concern is performance difference, rather than 
functional, right?

[~pxiong]: Ran through my perf tests last night and this patch is nearly free. 
Because there are no branches or virtual calls, the incq instruction doesn't 
have a CPU stall associated with it and runs with no measurable perf impact. 
No perf concerns for this impl: closeOp() is not a hot function, so it doesn't 
matter unless a user is explicitly running "explain analyze".
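
The pattern described above can be sketched roughly as follows. This is an
illustrative stand-in, not the code from the patch: the class name, the
process()/close() methods and the map used as a stats sink are hypothetical.
The point it shows is that the per-row cost is a single field increment, while
the conditional publishing happens once, at close time.

```java
import java.util.HashMap;
import java.util.Map;

public final class RowCountingOperatorSketch {
    private final String operatorId;
    private final boolean collectRuntimeStats;
    private long rowCount; // plain field, incremented unconditionally on the hot path

    public RowCountingOperatorSketch(String operatorId, boolean collectRuntimeStats) {
        this.operatorId = operatorId;
        this.collectRuntimeStats = collectRuntimeStats;
    }

    /** Hot path: the counter update is one increment, with no branch and no extra call. */
    public void process(Object row) {
        rowCount++;
        // ... real row processing would go here ...
    }

    /** Cold path: runs once per operator, so the branch here costs next to nothing. */
    public void close(Map<String, Long> publishedStats) {
        if (collectRuntimeStats) {
            publishedStats.put(operatorId, rowCount);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> published = new HashMap<>();
        RowCountingOperatorSketch scan = new RowCountingOperatorSketch("TS_0", true);
        for (int i = 0; i < 1000; i++) {
            scan.process(i);
        }
        scan.close(published);
        System.out.println(published);
    }
}
```

Keeping the "are we collecting stats?" check out of process() is what keeps the
hot path free of branches; the counter is maintained either way and is simply
ignored when stats are not requested.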

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432092#comment-15432092
 ] 

Pengcheng Xiong commented on HIVE-14362:


Hi [~gopalv], it works for the q tests (see explainanalyze_1 through 
explainanalyze_5). I also tested it on the cluster and it worked for some 
simple queries. Thanks for finding this out; I agree with you. I think we need 
to change that to an HDFS temp folder somewhere. We can improve that anyway. 
Btw, I assume that your major concern is performance difference, rather than 
functional, right? Thanks. :)

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432018#comment-15432018
 ] 

Gopal V commented on HIVE-14362:


The local path was traced to 
config.setExplainRootPath(ctx.getLocalTmpPath()); in SemanticAnalyzer.

The path for collecting stats has to be in the Hive session dir on HDFS. I'll 
try to patch this tomorrow and run it again.
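
As a rough illustration of the distinction being drawn here, the sketch below
checks whether a candidate stats root is a local path and, if so, falls back to
a session directory on a shared filesystem. It is not the fix that went into
the patch: the class, both helper methods, the namenode address and the
directory names are hypothetical, and only java.net.URI from the JDK is used.

```java
import java.net.URI;

public class StatsDirSketch {
    /** A path counts as local when it has no scheme or uses the "file" scheme. */
    public static boolean isLocal(URI path) {
        String scheme = path.getScheme();
        return scheme == null || scheme.equalsIgnoreCase("file");
    }

    /** Hypothetical helper: prefer the session dir whenever the candidate is local. */
    public static URI chooseStatsRoot(URI candidate, URI sessionDir) {
        return isLocal(candidate) ? sessionDir : candidate;
    }

    public static void main(String[] args) {
        // Both locations are made up for the example.
        URI localTmp = URI.create("file:/tmp/gopal/session-id/-local-1");
        URI sessionDir = URI.create("hdfs://namenode:8020/tmp/hive/gopal/session-id");

        URI root = chooseStatsRoot(localTmp, sessionDir);
        System.out.println("stats root: " + root);
        System.out.println("per-operator dir: " + URI.create(root + "/RS_8"));
    }
}
```

A local root is only visible on the machine that wrote it, so stats written by
tasks on other nodes would not be readable from the HiveServer2 host.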

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431939#comment-15431939
 ] 

Gopal V commented on HIVE-14362:


[~pxiong]: tested this patch - running explain analyze seems to disable 
vectorization for all queries after that point.

{code}
+  HiveConf.setBoolVar(conf, HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, false);
{code}
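
One common way to keep a temporary setting like this from leaking into later
queries is to restore the previous value once the statement finishes. The
sketch below shows that idea with a plain Map standing in for the session
configuration; it is an assumption about a possible approach, not how the patch
resolved the issue. The class and the runWithOverride helper are hypothetical;
hive.vectorized.execution.enabled is the property name behind
HIVE_VECTORIZATION_ENABLED.

```java
import java.util.HashMap;
import java.util.Map;

public class ScopedConfSketch {
    public static final String VECTORIZATION_KEY = "hive.vectorized.execution.enabled";

    /** Runs the action with the key overridden, then restores whatever was there before. */
    public static void runWithOverride(Map<String, String> conf, String key,
                                       String value, Runnable action) {
        boolean hadKey = conf.containsKey(key);
        String previous = conf.get(key);
        conf.put(key, value);
        try {
            action.run();
        } finally {
            // Restore even if the action throws, so later queries see the old value.
            if (hadKey) {
                conf.put(key, previous);
            } else {
                conf.remove(key);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> sessionConf = new HashMap<>();
        sessionConf.put(VECTORIZATION_KEY, "true");

        runWithOverride(sessionConf, VECTORIZATION_KEY, "false",
            () -> System.out.println("during: " + sessionConf.get(VECTORIZATION_KEY)));

        System.out.println("after: " + sessionConf.get(VECTORIZATION_KEY));
    }
}
```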

And explain analyze does not actually work.

{code}
2016-08-23T01:13:10,961  INFO [667a4e5f-6194-438f-85d6-339aca3ebecc main] physical.AnnotateRunTimeStatsOptimizer: setRuntimeStatsDir for RS_8
2016-08-23T01:13:10,962  INFO [667a4e5f-6194-438f-85d6-339aca3ebecc main] fs.FSStatsPublisher: created : file:/tmp/gopal/667a4e5f-6194-438f-85d6-339aca3ebecc/hive_2016-08-23_01-13-10_705_7555853843090786759-1/-local-1/RS_8
{code}

The paths for output are in local dirs, not the HDFS dirs - so the stats 
written on a machine are not making their way back to the HiveServer2 box.

{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: StatsPublisher cannot be connected to.There was a error while connecting to the StatsPublisher, and retrying might help. If you dont want the query to fail because accurate statistics could not be collected, set hive.stats.reliable=false
    at org.apache.hadoop.hive.ql.exec.Operator.publishRunTimeStats(Operator.java:1444)
    at org.apache.hadoop.hive.ql.exec.Operator.closeOp(Operator.java:723)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:270)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:691)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:705)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:433)
{code}

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-20 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429347#comment-15429347
 ] 

Hive QA commented on HIVE-14362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12824671/HIVE-14362.02.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 10476 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_quoting]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_tbllvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constant_prop_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_sort_optimization_acid]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exec_parallel_column_stats]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hadoop.hive.ql.exec.TestExplainTask.testExplainDoesSortMapValues
org.apache.hadoop.hive.ql.exec.TestExplainTask.testExplainDoesSortPathAsStrings
org.apache.hadoop.hive.ql.exec.TestExplainTask.testExplainDoesSortTopLevelMapEntries
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/946/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/946/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-946/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12824671 - PreCommit-HIVE-MASTER-Build

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429171#comment-15429171
 ] 

Gopal V commented on HIVE-14362:


Thanks, [~pxiong], for running the benchmarks. I'll add this to Monday's build.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-19 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429128#comment-15429128
 ] 

Pengcheng Xiong commented on HIVE-14362:


[~gopalv], the patch is ready. Could you please let us know whether you are 
satisfied with the performance comparison results? Thanks.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.





[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419783#comment-15419783
 ] 

Hive QA commented on HIVE-14362:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823510/HIVE-14362.01.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 234 failed/errored test(s), 10471 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join32]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_smb_mapjoin_14]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_6]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_7]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_8]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_6]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_7]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_8]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketizedhiveinputformat_auto]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_6]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_7]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_8]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explainanalyze_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explainanalyze_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explainanalyze_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explainanalyze_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explainanalyze_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_filters]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_nulls]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_nullsafe]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_join]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_join_partition_key]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_11]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_12]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_14]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_15]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_16]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_17]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_4]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_6]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smblimit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_8]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
{noformat}

[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-12 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419249#comment-15419249
 ] 

Gopal V commented on HIVE-14362:


[~pxiong]: this approach was abandoned earlier due to known performance issues 
- 
https://issues.apache.org/jira/browse/HIVE-4318?focusedCommentId=13629957=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13629957

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419242#comment-15419242
 ] 

Pengcheng Xiong commented on HIVE-14362:


Cc'ing [~gopalv]; I will do a performance test soon.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.


