[ 
https://issues.apache.org/jira/browse/CALCITE-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579068#comment-17579068
 ] 

Julian Hyde commented on CALCITE-4496:
--------------------------------------

[~jaceksan], Thanks for reaching out. It would be great to have your 
participation.

In order to collaborate, we need to figure out what are the key use cases of 
MAQL and how to express those using the proposed extensions to SQL (which I can 
calling "SQL-with-measures" or "analytic SQL"). I propose the following 
process: You write a list of MAQL queries that illustrate the key extensions. 
You and I collaborate on a document that maps those MAQL queries to SQL 
queries. We should start off with a meeting to workshop the document, then 
continue to collaborate offline.

In the spirit of openness, others are welcome to attend the meeting, and the 
document is readable to the world. I am happy to repeat the same process with 
any other BI vendors or projects.

Does that process sound right to you?

> Measure columns ("SELECT ... AS MEASURE")
> -----------------------------------------
>
>                 Key: CALCITE-4496
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4496
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Priority: Major
>
> In multi-dimensional languages such as MDX, DAX, Tableau, you can define 
> calculations in your models that can be re-evaluated in other dimensional 
> contexts. (The models are often called cubes, and the calculations are often 
> called measures.)
> In SQL, the model is a view (or a sub-query in the FROM clause) but the 
> columns are just values. Suppose you have a private {{Employees}} table, a 
> {{Departments}} view that rolls {{Employees}} up to department level and has 
> an {{averageSalary}} column. Now suppose you wish to roll up 
> {{averageSalary}} to the region level. The values that went into 
> {{averageSalary}} are not available to you, either directly or indirectly, so 
> the best you can do is to average-the-averages.
> In this proposed (and experimental) feature, you can define a special kind of 
> column - a measure - in the SELECT list of a view (or sub-query in a FROM 
> clause), and it remains a calculation. When a query uses a measure column, 
> the calculation is re-evaluated in the context of that query.
> To some extent, this breaches the "black box" property of SQL views. 
> Hitherto, a SQL view can be replaced with a table that has the same contents, 
> and all queries that use that view will return the same results. That 
> property no longer holds. But the view remains a useful "hiding" abstraction, 
> and the rows that compose that view cannot be viewed directly.
> Like dimensional models, measures in SQL would allow high-level abstractions 
> such as key-performance indicators (KPIs) to be shared and composed. Unlike 
> dimensional models, the models remain relational, namely, it is still 
> possible to enumerate and count the rows in a model.
> Consider the following view and query that uses it:
> {code:sql}
> CREATE VIEW EmpSummary AS
> SELECT deptno,
>     job,
>     AVG(sal) AS avg_sal,
>     AVG(sal) AS MEASURE avg_sal_measure,
>     COUNT(*) + 1 AS MEASURE count_plus_one_measure
> FROM Emp
> GROUP BY deptno, job;
> SELECT deptno,
>     AVG(avg_sal) AS a1,
>     AGGREGATE(avg_sal_measure) AS a2,
>     AGGREGATE(count_plus_one_measure) AS c1
> FROM EmpSummary
> GROUP BY deptno;{code}
> Note that there is a special aggregate function, {{AGGREGATE}}, that rolls up 
> measures. Columns {{a1}} and {{a2}} will contain different values; the first 
> averages the averages, and the second computes the average from the raw data. 
> Column {{c1}} will return the number of employees in each department plus 
> one, not rolling up the "plus one" for each distinct job in the department.
> This is just a brief sketch illustrating the purpose of measures. This 
> feature is experimental, the syntax will no doubt change, and much of the 
> semantics (for example, what expressions are valid as measures, whether 
> measures remain measures they appear in the SELECT clause of an enclosing 
> query, and what is the "context" in which a measure is evaluated) need to be 
> ironed out.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to