[
https://issues.apache.org/jira/browse/PHOENIX-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171700#comment-15171700
]
Haoran Zhang edited comment on PHOENIX-154 at 2/29/16 12:09 PM:
----------------------------------------------------------------
Thanks for your advice [~giacomotaylor].
After reading the material you provided, I have a better understanding of this
issue and have drafted an initial plan.
However, there are still several points that I am confused about.
1. I plan to draft a proposal to implement window functions[1] for Apache
Phoenix when the window has the form [ PARTITION BY expression [, expression ]* ].
In other words, it adds support for the PARTITION BY keyword. I would like to
know whether this workload is enough for a GSoC term.
2. Regarding the issue [PHOENIX-2700 |
https://issues.apache.org/jira/browse/PHOENIX-2700], I noticed that a suitable
solution is to implement a sliding window, which can improve performance
by reducing unnecessary data translation. However, in my opinion, it only helps
when a child query exists, especially when the child query is an OLAP query. For
example, suppose we have a sample query like
{code:sql}
SELECT country_name,
       state_name,
       county_name,
       SUM(population) OVER (PARTITION BY country_name) AS country_population,
       SUM(population) OVER (PARTITION BY state_name)   AS state_population,
       SUM(population) OVER (PARTITION BY county_name)  AS county_population
FROM   census -- hypothetical table name, for illustration only
{code}
In this case, I think a sliding window may not improve performance. The
sliding window is not the basis of window functions but an optimization on top
of them. Is that right?
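To make the semantics I have in mind concrete, here is a minimal Python sketch
(not Phoenix code; the sample rows and the helper name sum_over_partition are
made up for illustration). It emulates SUM(population) OVER (PARTITION BY ...)
for the three windows in the query above. Each window needs its own grouping
pass over the same rows, and no window frame slides across rows.

```python
from collections import defaultdict

# Hypothetical sample rows: (country_name, state_name, county_name, population)
ROWS = [
    ("USA", "California", "Alameda", 10),
    ("USA", "California", "Marin", 5),
    ("USA", "New York", "Kings", 20),
    ("Canada", "Ontario", "York", 7),
]


def sum_over_partition(rows, key_index, value_index=3):
    """Emulate SUM(value) OVER (PARTITION BY key).

    Unlike GROUP BY, every input row is kept; each row simply receives
    the total of the partition it belongs to.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return [totals[row[key_index]] for row in rows]


# One independent grouping pass per distinct PARTITION BY key.
country_population = sum_over_partition(ROWS, 0)
state_population = sum_over_partition(ROWS, 1)
county_population = sum_over_partition(ROWS, 2)

for row, c, s, k in zip(ROWS, country_population, state_population, county_population):
    print(row[:3], c, s, k)
```

Since the three partition keys differ, the three passes do not seem to share
any intermediate state, which is why I doubt a sliding window helps here.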
3. When a SQL statement contains 'PARTITION BY partition_key', I think we should
guarantee that the rows for each partition_key reside on only one region server;
otherwise, the situation could be quite tricky. Nonetheless, I can't find an
appropriate way to guarantee this. Imposing a restriction in the DDL is not a
universal solution, and simply throwing an exception is not user-friendly.
Would you mind giving me some suggestions?
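The following Python sketch shows why I think this is tricky (again, this is
not Phoenix code; the region layout and the helper names partial_sums and
merge_partials are assumptions for illustration). If rows for one partition key
live on two region servers, each server only sees a partial total. For SUM the
partial totals could be merged afterwards, but I am not sure whether such a
merge step fits Phoenix, and it would not be enough for functions like RANK.

```python
from collections import Counter

# Hypothetical layout: (partition_key, value) rows stored on two region servers.
REGION_A = [("USA", 10), ("USA", 5), ("Canada", 7)]
REGION_B = [("USA", 20), ("Mexico", 3)]


def partial_sums(region_rows):
    """Total per partition key, computed from one region's rows only."""
    sums = Counter()
    for key, value in region_rows:
        sums[key] += value
    return sums


def merge_partials(partials):
    """Combine per-region totals into global totals per partition key."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values for matching keys
    return merged


per_region = [partial_sums(REGION_A), partial_sums(REGION_B)]
merged = merge_partials(per_region)

# "USA" spans both regions, so neither region alone has the correct total.
print(per_region[0]["USA"], per_region[1]["USA"], merged["USA"])
```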
Thanks
[1] [https://calcite.apache.org/docs/reference.html#window-functions]
> Support SQL OLAP extensions
> ---------------------------
>
> Key: PHOENIX-154
> URL: https://issues.apache.org/jira/browse/PHOENIX-154
> Project: Phoenix
> Issue Type: New Feature
> Reporter: James Taylor
> Labels: gsoc2016
>
> Support the WINDOW, PARTITION OVER, GROUPING, RANK, DENSE RANK, ORDER BY etc.
> functionality.