[jira] [Updated] (PHOENIX-654) Minimize projection into scan for VIEW

James Taylor (JIRA) Fri, 16 May 2014 15:21:13 -0700

     [ 
https://issues.apache.org/jira/browse/PHOENIX-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


James Taylor updated PHOENIX-654:
---------------------------------

    Description: 
When you create a TABLE, we insert into the first column family an empty key 
value that we can count on being there for every row. For a VIEW, we don't do 
that, so we fall back on projecting every column into the scan. If there are 
lots of columns (for example, 60,000 in 
[this](https://groups.google.com/forum/_!topic/phoenix-hbase-user/JgQjlqC4-uw) 
case), the scan is very slow.

Instead, we should only project everything when absolutely necessary, in these 
cases:
* IS NULL expression
* CASE WHEN with an ELSE expression
* Usages of row value constructor
* When a column in the primary key is used
* When there is no WHERE clause
* When there is a GROUP BY of a nullable expression
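
The cases listed above amount to a single decision: project every column if any 
one of them applies, otherwise project only the columns the query references. A 
minimal Java sketch of that decision follows. The class, enum, and method names 
are hypothetical and are not Phoenix APIs; detecting each case in a real query 
plan is out of scope here.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not Phoenix APIs.
final class ViewProjectionSketch {

    // The query features from the list above that force a full projection.
    enum Feature {
        IS_NULL_EXPRESSION,
        CASE_WHEN_WITH_ELSE,
        ROW_VALUE_CONSTRUCTOR,
        PRIMARY_KEY_COLUMN_USED,
        NO_WHERE_CLAUSE,
        GROUP_BY_NULLABLE_EXPRESSION
    }

    // True when at least one of the listed cases applies to the query.
    static boolean mustProjectEverything(Set<Feature> features) {
        return !features.isEmpty();
    }

    // Picks the columns to add to the scan over a VIEW.
    static List<String> columnsToProject(Set<Feature> features,
                                         List<String> allColumns,
                                         List<String> referencedColumns) {
        return mustProjectEverything(features) ? allColumns : referencedColumns;
    }
}
```

With an empty feature set, only the referenced columns go into the scan; adding 
any single feature, such as NO_WHERE_CLAUSE, switches to the full column list.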

We could potentially do the same for a TABLE, but the empty key value seems 
like a better trade-off as far as performance goes. In addition, we need the 
empty key value because a row cannot exist without at least one key value; 
without it, we could not support use cases that only define a primary key.
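
To illustrate why a primary-key-only row needs the empty key value, here is a 
toy Java model in which a row exists only if at least one cell carries its row 
key. Everything in it is an assumption made for illustration: the class, the 
string-keyed cell map, and the placeholder family and qualifier names "cf" and 
"_empty" are not HBase or Phoenix APIs.

```java
import java.util.TreeMap;

// Hypothetical toy model of a cell store; not an HBase or Phoenix API.
final class EmptyKeyValueSketch {

    // Cells keyed by "rowKey/family:qualifier". A row has no other storage.
    private final TreeMap<String, byte[]> cells = new TreeMap<>();

    void putCell(String rowKey, String family, String qualifier, byte[] value) {
        cells.put(rowKey + "/" + family + ":" + qualifier, value);
    }

    // Writes a row that defines only a primary key (no other column values).
    // "cf" and "_empty" are placeholder names chosen for this sketch.
    void upsertPrimaryKeyOnly(String rowKey, boolean writeEmptyKeyValue) {
        if (writeEmptyKeyValue) {
            putCell(rowKey, "cf", "_empty", new byte[0]);
        }
        // Without the empty key value there is nothing to store for the row.
    }

    // A row exists only if some cell key starts with its row key prefix.
    boolean rowExists(String rowKey) {
        String prefix = rowKey + "/";
        String next = cells.ceilingKey(prefix);
        return next != null && next.startsWith(prefix);
    }
}
```

In this model, upserting a primary-key-only row without the empty key value 
leaves no trace of the row, while writing the empty key value makes it visible.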

  was:
When you create a TABLE, we insert an empty key value into the first column 
family that we can count on being there for every row. For a VIEW, we don't do 
that, so we just fall back on projecting everything into a scan. If there are 
lots of columns (for example, 60,000 in 
[this](https://groups.google.com/forum/_!topic/phoenix-hbase-user/JgQjlqC4-uw) 
case), the scan is very slow.

Instead, we should only project everything when absolutely necessary, in these 
cases:
* When the EvaluateOnCompletionVisitor is run over the where clause expression 
returns true for visitor.evaluateOnCompletion(). This captures cases such as:
   * IS NULL check
   * CASE WHEN ELSE
   * Usages of row value constructor
* When there is no where clause
* When there is a group by of a nullable expression

We could potentially do the same for a TABLE, but the empty key value seems 
like a better trade off as far as performance goes. In addition, we need the 
empty key value as a row cannot exist without at least one key value, making it 
impossible to support use cases that only define a primary key.


> Minimize projection into scan for VIEW
> --------------------------------------
>
>                 Key: PHOENIX-654
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-654
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor



--
This message was sent by Atlassian JIRA
(v6.2#6252)
