[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-05-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: proposal.pdf

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-05-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: (was: proposal.pdf)

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13758) "Create table like" command should initialize the basic stats for the table

2016-05-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13758:
---
Attachment: HIVE-13758.01.patch

> "Create table like" command should initialize the basic stats for the table
> ---
>
> Key: HIVE-13758
> URL: https://issues.apache.org/jira/browse/HIVE-13758
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13758.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13758) "Create table like" command should initialize the basic stats for the table

2016-05-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13758:
---
Status: Patch Available  (was: Open)

> "Create table like" command should initialize the basic stats for the table
> ---
>
> Key: HIVE-13758
> URL: https://issues.apache.org/jira/browse/HIVE-13758
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13758.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13758) "Create table like" command should initialize the basic stats for the table

2016-05-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13758:
---
Status: Open  (was: Patch Available)

> "Create table like" command should initialize the basic stats for the table
> ---
>
> Key: HIVE-13758
> URL: https://issues.apache.org/jira/browse/HIVE-13758
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13758.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13758) "Create table like" command should initialize the basic stats for the table

2016-05-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284160#comment-15284160
 ] 

Pengcheng Xiong commented on HIVE-13758:


resubmit the patch

> "Create table like" command should initialize the basic stats for the table
> ---
>
> Key: HIVE-13758
> URL: https://issues.apache.org/jira/browse/HIVE-13758
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13758.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13758) "Create table like" command should initialize the basic stats for the table

2016-05-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13758:
---
Attachment: (was: HIVE-13758.01.patch)

> "Create table like" command should initialize the basic stats for the table
> ---
>
> Key: HIVE-13758
> URL: https://issues.apache.org/jira/browse/HIVE-13758
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-05-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: (was: proposal.pdf)

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-05-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: proposal.pdf

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Status: Patch Available  (was: Open)

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, 
> HIVE-12643.3.patch, HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Attachment: HIVE-12643.3.patch

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, 
> HIVE-12643.3.patch, HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Status: Open  (was: Patch Available)

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, 
> HIVE-12643.3.patch, HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284128#comment-15284128
 ] 

Hive QA commented on HIVE-13453:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12804092/HIVE-13453.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 31 failed/errored test(s), 9198 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-script_pipe.q-vector_decimal_aggregate.q-vector_data_types.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - 
did not produce a TEST-*.xml file
TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-auto_join30.q-join2.q-input17.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby6_map.q-join13.q-join_reorder3.q-and-12-more - did 
not produce a TEST-*.xml file
TestSparkCliDriver-load_dyn_part5.q-load_dyn_part2.q-skewjoinopt16.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_innerjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input1_limit
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_vc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union33
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle
{noformat}

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/294/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/294/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-294/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 31 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12804092 - PreCommit-HIVE-MASTER-Build

> Support ORDER BY and windowing clause in partitioning clause with distinct 
> function
> ---
>
> Key: HIVE-13453
> URL: https://issues.apache.org/jira/browse/HIVE-13453
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13453.1.patch, HIVE-13453.2.patch, 
> HIVE-13453.3.patch
>
>
> Current distinct function on partitioning doesn't support order by and 
> windowing clause due to performance reason. Explore an efficient way to 
> support that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13354) Add ability to specify Compaction options per table and per request

2016-05-15 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13354:
-
Attachment: HIVE-13354.1.patch

Upload complete patch 1. [~ekoifman] Can you please review?

> Add ability to specify Compaction options per table and per request
> ---
>
> Key: HIVE-13354
> URL: https://issues.apache.org/jira/browse/HIVE-13354
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC2.1
> Attachments: HIVE-13354.1.patch, 
> HIVE-13354.1.withoutSchemaChange.patch
>
>
> Currently the are a few options that determine when automatic compaction is 
> triggered.  They are specified once for the warehouse.
> This doesn't make sense - some table may be more important and need to be 
> compacted more often.
> We should allow specifying these on per table basis.
> Also, compaction is an MR job launched from within the metastore.  There is 
> currently no way to control job parameters (like memory, for example) except 
> to specify it in hive-site.xml for metastore which means they are site wide.
> Should add a way to specify these per table (perhaps even per compaction if 
> launched via ALTER TABLE)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13293) Query occurs performance degradation after enabling parallel order by for Hive on Spark

2016-05-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-13293:
--
Attachment: HIVE-13293.3.patch

> Query occurs performance degradation after enabling parallel order by for 
> Hive on Spark
> ---
>
> Key: HIVE-13293
> URL: https://issues.apache.org/jira/browse/HIVE-13293
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-13293.1.patch, HIVE-13293.2.patch, 
> HIVE-13293.3.patch, HIVE-13293.3.patch, HIVE-13293.3.patch
>
>
> I use TPCx-BB to do some performance test on Hive on Spark engine. And found 
> query 10 has performance degradation when enabling parallel order by.
> It seems that sampling cost much time before running the real query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13602) TPCH q16 return wrong result when CBO is on

2016-05-15 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284045#comment-15284045
 ] 

Nemon Lou commented on HIVE-13602:
--

Thanks [~pxiong] .It will be nice to provide a patch for branch-1, too. If 
there will be a branch-1 release in the future .

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0, 1.2.2
>Reporter: Nemon Lou
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13602.01.patch, HIVE-13602.03.patch, 
> HIVE-13602.04.patch, HIVE-13602.05.patch, HIVE-13602.final.patch, 
> calcite_cbo_bad.out, calcite_cbo_good.out, explain_cbo_bad_part1.out, 
> explain_cbo_bad_part2.out, explain_cbo_bad_part3.out, 
> explain_cbo_good(rewrite)_part1.out, explain_cbo_good(rewrite)_part2.out, 
> explain_cbo_good(rewrite)_part3.out
>
>
> Running tpch with factor 2, 
> q16 returns 1,160 rows when CBO is on,
> while returns 24,581 rows when CBO is off.
> See attachment for detail .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13453:

Attachment: HIVE-13453.3.patch

> Support ORDER BY and windowing clause in partitioning clause with distinct 
> function
> ---
>
> Key: HIVE-13453
> URL: https://issues.apache.org/jira/browse/HIVE-13453
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13453.1.patch, HIVE-13453.2.patch, 
> HIVE-13453.3.patch
>
>
> Current distinct function on partitioning doesn't support order by and 
> windowing clause due to performance reason. Explore an efficient way to 
> support that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13453:

Attachment: (was: HIVE-13453.3.patch)

> Support ORDER BY and windowing clause in partitioning clause with distinct 
> function
> ---
>
> Key: HIVE-13453
> URL: https://issues.apache.org/jira/browse/HIVE-13453
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13453.1.patch, HIVE-13453.2.patch, 
> HIVE-13453.3.patch
>
>
> Current distinct function on partitioning doesn't support order by and 
> windowing clause due to performance reason. Explore an efficient way to 
> support that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: HIVE-13149.8.patch

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.1.0
>
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In SessionState class, currently we will always try to get a HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} 
> regardless of if the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, although most 
> of the tasks other than some like StatsTask, don't need to access HMS. 
> Currently a new HMS connection will be established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, then the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: (was: HIVE-13149.8.patch)

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.1.0
>
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In SessionState class, currently we will always try to get a HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} 
> regardless of if the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, although most 
> of the tasks other than some like StatsTask, don't need to access HMS. 
> Currently a new HMS connection will be established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, then the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13269 started by Jesus Camacho Rodriguez.
--
> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
Status: Patch Available  (was: In Progress)

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
Status: Open  (was: Patch Available)

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Attachment: HIVE-12643.3.patch

Rebased on master.

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, 
> HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Status: Patch Available  (was: Open)

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, 
> HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12643:

Status: Open  (was: Patch Available)

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13293) Query occurs performance degradation after enabling parallel order by for Hive on Spark

2016-05-15 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-13293:
--
Attachment: HIVE-13293.3.patch

[~xuefuz] - yeah sampling is needed only if we have more than 1 partitions. So 
we don't need cache in case of single reducer order by.
Not sure why the build failed. Upload same patch to try again.

> Query occurs performance degradation after enabling parallel order by for 
> Hive on Spark
> ---
>
> Key: HIVE-13293
> URL: https://issues.apache.org/jira/browse/HIVE-13293
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.0.0
>Reporter: Lifeng Wang
>Assignee: Rui Li
> Attachments: HIVE-13293.1.patch, HIVE-13293.2.patch, 
> HIVE-13293.3.patch, HIVE-13293.3.patch
>
>
> I use TPCx-BB to do some performance test on Hive on Spark engine. And found 
> query 10 has performance degradation when enabling parallel order by.
> It seems that sampling cost much time before running the real query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13269 started by Jesus Camacho Rodriguez.
--
> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
Status: Patch Available  (was: In Progress)

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13269:
---
Status: Open  (was: Patch Available)

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-05-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: proposal.pdf

Hi Vaibhav, I've attached a proposal draft with your suggestion of having the 
client send the compressor plugin. I've finished the high-level overview and 
started getting into the implementation details. 

I’m also maintaining a document that we can use as a starting point for 
discussion.
https://github.com/kliewkliew/HIVE-13680/blob/master/design-considerations/design-considerations.pdf

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)