[jira] [Commented] (HIVE-12075) add analyze command to explicitly cache file metadata in HBase metastore

2016-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076460#comment-15076460
 ] 

Lefty Leverenz commented on HIVE-12075:
---

Doc note:  "ANALYZE TABLE (table-spec) CACHE METADATA" will need to be 
documented in the Statistics wikidoc for release 2.1.0, and 
*hive.metastore.hbase.file.metadata.threads* belongs in the MetaStore section 
of Configuration Properties.

* [Existing Tables – ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables–ANALYZE]
* [Configuration Properties -- MetaStore | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore]

> add analyze command to explicitly cache file metadata in HBase metastore
> ---
>
> Key: HIVE-12075
> URL: https://issues.apache.org/jira/browse/HIVE-12075
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12075.01.nogen.patch, HIVE-12075.01.patch, 
> HIVE-12075.02.patch, HIVE-12075.03.patch, HIVE-12075.04.patch, 
> HIVE-12075.nogen.patch, HIVE-12075.patch
>
>
> ANALYZE TABLE (spec as usual) CACHE METADATA



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12763) Use bit vector to track per partition NDV

2016-01-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12763:
---
Attachment: HIVE-12763.01.patch

> Use bit vector to track per partition NDV
> -
>
> Key: HIVE-12763
> URL: https://issues.apache.org/jira/browse/HIVE-12763
> Project: Hive
> Issue Type: Improvement
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-12763.01.patch
>
>
> This will improve merging of per-partition stats.
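
To illustrate why a bit vector makes per-partition NDV (number of distinct values) easy to merge, here is a minimal sketch. It uses linear counting as a stand-in technique; the issue does not say which estimator the patch uses, so the class name, vector size, hash function, and estimator below are all assumptions for illustration only. The point it shows is that two partitions' bit vectors can be combined with a bitwise OR, and values present in both partitions are not double counted.

```java
import java.util.BitSet;

// Hypothetical illustration, not the code from the HIVE-12763 patch.
class NdvBitVectorSketch {
    // Bits per partition vector (an assumed size; larger means lower error).
    static final int M = 1 << 12;

    // 64-bit mixing function (the MurmurHash3 finalizer) to spread values over bits.
    static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    // Build the bit vector for one partition: each value sets exactly one bit.
    static BitSet build(long[] values) {
        BitSet bits = new BitSet(M);
        for (long v : values) {
            bits.set((int) Math.floorMod(mix(v), (long) M));
        }
        return bits;
    }

    // Merge two partitions: the OR of the vectors equals the vector of the union.
    static BitSet merge(BitSet a, BitSet b) {
        BitSet out = (BitSet) a.clone();
        out.or(b);
        return out;
    }

    // Linear-counting estimate: n ~ -M * ln(fraction of zero bits).
    static double estimate(BitSet bits) {
        int zeros = M - bits.cardinality();
        if (zeros == 0) {
            return Double.POSITIVE_INFINITY; // vector saturated, no usable estimate
        }
        return -M * Math.log((double) zeros / M);
    }
}
```

By contrast, if each partition stored only a numeric NDV, the merged NDV could only be bounded (between the maximum and the sum of the per-partition values), because the overlap between partitions would be unknown.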





[jira] [Updated] (HIVE-12709) further improve user level explain

2016-01-01 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12709:
---
Attachment: HIVE-12709.03.patch

> further improve user level explain
> --
>
> Key: HIVE-12709
> URL: https://issues.apache.org/jira/browse/HIVE-12709
> Project: Hive
> Issue Type: Sub-task
> Components: Diagnosability
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Fix For: 1.2.0
>
> Attachments: HIVE-12709.01.patch, HIVE-12709.02.patch, 
> HIVE-12709.03.patch
>
>
> Need to address more feedback from Hive users for the user-level explain:
>  (1) Put stats on the same line as the operator;
>  (2) Avoid stats on *Sink;
>  (3) Avoid col types;
>  (4) TS should list pruned col names;
>  (5) TS should list fully qualified table name, along with alias; etc.





[jira] [Updated] (HIVE-12075) add analyze command to explicitly cache file metadata in HBase metastore

2016-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12075:
--
Labels: TODOC2.1  (was: )






[jira] [Commented] (HIVE-12431) Support timeout for compile lock

2016-01-01 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076463#comment-15076463
 ] 

Lefty Leverenz commented on HIVE-12431:
---

Doc note:  The new configuration parameter *hive.server2.compile.lock.timeout* 
will need to be documented for release 2.1.0 in the HiveServer2 section of 
Configuration Properties.

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]

> Support timeout for compile lock
> 
>
> Key: HIVE-12431
> URL: https://issues.apache.org/jira/browse/HIVE-12431
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Query Processor
> Affects Versions: 1.2.1
> Reporter: Lenni Kuff
> Assignee: Mohit Sabharwal
> Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12431.1.patch, HIVE-12431.2.patch, 
> HIVE-12431.3.patch, HIVE-12431.3.patch, HIVE-12431.patch
>
>
> To help with HiveServer2 scalability, it would be useful to allow users to 
> configure a timeout value for queries waiting to be compiled. If the timeout 
> value is reached then the query would abort. One option to achieve this would 
> be to update the compile lock to use a try-lock with the timeout value.
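
The try-lock idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class name, the method name, and the choice to treat a non-positive timeout as "wait indefinitely" are hypothetical. Only `java.util.concurrent.locks.ReentrantLock` and its `tryLock(long, TimeUnit)` method are real JDK APIs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of a compile lock with a timeout.
class CompileLockSketch {
    // Stand-in for the single lock that serializes query compilation.
    private static final ReentrantLock COMPILE_LOCK = new ReentrantLock();

    /**
     * Runs compileStep while holding the lock.
     * Returns false if the lock was not acquired within timeoutSeconds,
     * or if the waiting thread was interrupted; the caller would then abort the query.
     * A timeout of 0 or less waits indefinitely (an assumption of this sketch).
     */
    static boolean compileWithTimeout(Runnable compileStep, long timeoutSeconds) {
        try {
            if (timeoutSeconds > 0) {
                if (!COMPILE_LOCK.tryLock(timeoutSeconds, TimeUnit.SECONDS)) {
                    return false; // timed out waiting for the compile lock
                }
            } else {
                COMPILE_LOCK.lockInterruptibly();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        try {
            compileStep.run();
            return true;
        } finally {
            COMPILE_LOCK.unlock(); // always release, even if compilation throws
        }
    }
}
```

A caller would check the return value and fail the query with a timeout error when it is false, instead of leaving the request queued behind a long-running compilation.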





[jira] [Updated] (HIVE-12431) Support timeout for compile lock

2016-01-01 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12431:
--
Labels: TODOC2.1  (was: )






[jira] [Updated] (HIVE-12625) Backport to branch-1 HIVE-11981 ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2016-01-01 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12625:

Attachment: HIVE-12625.6-branch1.patch

> Backport to branch-1 HIVE-11981 ORC Schema Evolution Issues (Vectorized, 
> ACID, and Non-Vectorized)
> --
>
> Key: HIVE-12625
> URL: https://issues.apache.org/jira/browse/HIVE-12625
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-12625.1-branch1.patch, HIVE-12625.2-branch1.patch, 
> HIVE-12625.3-branch1.patch, HIVE-12625.4-branch1.patch, 
> HIVE-12625.5-branch1.patch, HIVE-12625.6-branch1.patch
>
>






[jira] [Updated] (HIVE-12762) Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true

2016-01-01 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12762:

Attachment: HIVE-12762.patch

> Common join on parquet tables returns incorrect result when 
> hive.optimize.index.filter set to true
> --
>
> Key: HIVE-12762
> URL: https://issues.apache.org/jira/browse/HIVE-12762
> Project: Hive
> Issue Type: Bug
> Components: Logical Optimizer
> Affects Versions: 2.1.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Attachments: HIVE-12762.patch, HIVE-12762.patch
>
>
> The following query gives an incorrect result.
> {noformat}
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> SELECT tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}
> We force a common join, and tbl2 will have 2 underlying files after the 2 
> insertions. The map job contains 3 TableScan operators (2 for tbl2 and 1 for 
> tbl1). When hive.optimize.index.filter is set to true, we incorrectly apply 
> the later filtering condition to each block, so no data is returned for the 
> subquery {{SELECT * FROM tbl2 WHERE value='value1'}}.


