[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-16 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-13884:
--
Labels:   (was: TODOC2.2)

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
Labels: TODOC2.2  (was: )

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Tests are not related to this patch. I tested locally, and previous jobs are 
failing with them too.

Thanks guys for the review. I committed this to 2.2

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
Attachment: HIVE-13884.10.patch

Here's another patch the fixes the tests, and it avoids that the configuration 
variable is freely modified by the client. Only the HMS server admins should be 
able to do so from the HMS hive-site.xml.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
Status: Open  (was: Patch Available)

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
Status: Patch Available  (was: Open)

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13884:
---
Attachment: HIVE-13884.9.patch

Here's another patch that includes some tests on TestHiveMetaStore. There was 
an issue found by those tests when requesting a MAX number of partitions. It is 
fixed now.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-28 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-13884:
---
Summary: Disallow queries in HMS fetching more than a configured number of 
partitions  (was: Disallow queries fetching more than a configured number of 
partitions in PartitionPruner)

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)