villebro opened a new pull request #9427: feat: Add post processing to QueryObject
URL: https://github.com/apache/incubator-superset/pull/9427

### CATEGORY
Choose one

- [ ] Bug Fix
- [x] Enhancement (new features, refinement)
- [ ] Refactor
- [x] Add tests
- [ ] Build / Development Environment
- [ ] Documentation

### SUMMARY
Currently the `/api/v1/query` endpoint doesn't support post-SQL data processing. This functionality is necessary for decoupling the backend from the frontend, because many of the data operations needed for advanced visualizations are either not readily available in the JavaScript ecosystem or are infeasible due to network or computational expense. This PR adds the post-query data processing functionality that Superset needs in order to deprecate `viz.py`, namely:

- `aggregate` (same as SQL `GROUP BY`)
- `pivot` (turning the values of a grouping column into columns and aggregating each cell)
- `sort` (same as `ORDER BY`)
- `rolling` (e.g. moving sums, averages)

This is done using functionality readily available in Pandas and NumPy. To use it, post processing operations can be defined on each query in the `queries` attribute of the `QueryContext` object. Below is an example from the unit tests, where the median and first quartile are computed on an already aggregated query, and the result is then sorted in descending order by the first-quartile value:

```python
{
    "queries": [
        {
            "granularity": "ds",
            "groupby": ["name", "state"],
            "metrics": [{"label": "sum__num"}],
            "filters": [],
            "row_limit": 100,
            "post_processing": [
                {
                    "operation": "aggregate",
                    "options": {
                        "groupby": ["state"],
                        "aggregates": {
                            "q1": {
                                "operator": "percentile",
                                "column": "sum__num",
                                "options": {"q": 25},
                            },
                            "median": {
                                "operator": "median",
                                "column": "sum__num",
                            },
                        },
                    },
                },
                {
                    "operation": "sort",
                    "options": {
                        "by": ["q1", "state"],
                        "ascending": {"q1": False},
                    },
                },
            ],
        }
    ],
}
```

This feature should be seen as experimental at this stage.
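For readers who want a feel for what the `aggregate` and `sort` steps in such a payload amount to, here is a minimal pandas/NumPy sketch. It is not the code from this PR: the helper names (`aggregate`, `sort`, `apply_post_processing`), the `NUMPY_OPERATORS` table, the sample data, and the rule that columns missing from `ascending` sort in ascending order are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical operator table: maps the "operator" strings used in the
# example payload to NumPy reducers. Not taken from the PR itself.
NUMPY_OPERATORS = {
    "percentile": np.percentile,
    "median": np.median,
    "mean": np.mean,
    "sum": np.sum,
}


def aggregate(df, groupby, aggregates):
    """Group by `groupby` and compute one output column per named aggregate."""
    grouped = df.groupby(groupby)
    columns = {}
    for name, spec in aggregates.items():
        func = NUMPY_OPERATORS[spec["operator"]]
        options = spec.get("options", {})
        # Bind func/options as defaults so each lambda keeps its own values.
        columns[name] = grouped[spec["column"]].agg(
            lambda s, f=func, o=options: f(s.to_numpy(), **o)
        )
    return pd.DataFrame(columns).reset_index()


def sort(df, by, ascending=None):
    """Sort by `by`; columns absent from `ascending` are assumed ascending."""
    ascending = ascending or {}
    flags = [ascending.get(col, True) for col in by]
    return df.sort_values(by=by, ascending=flags).reset_index(drop=True)


OPERATIONS = {"aggregate": aggregate, "sort": sort}


def apply_post_processing(df, post_processing):
    """Apply each operation in order, feeding the result into the next step."""
    for step in post_processing:
        df = OPERATIONS[step["operation"]](df, **step["options"])
    return df


# Made-up stand-in for the result of the SQL query (already grouped by
# name and state, with a sum__num metric).
query_result = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d", "e", "f", "g", "h", "i"],
        "state": ["CA", "CA", "CA", "CA", "NY", "NY", "NY", "TX", "TX"],
        "sum__num": [10, 20, 30, 40, 100, 200, 300, 5, 15],
    }
)

post_processing = [
    {
        "operation": "aggregate",
        "options": {
            "groupby": ["state"],
            "aggregates": {
                "q1": {
                    "operator": "percentile",
                    "column": "sum__num",
                    "options": {"q": 25},
                },
                "median": {"operator": "median", "column": "sum__num"},
            },
        },
    },
    {
        "operation": "sort",
        "options": {"by": ["q1", "state"], "ascending": {"q1": False}},
    },
]

result = apply_post_processing(query_result, post_processing)
print(result)
```

Each step takes a DataFrame and returns a new one, so operations can be chained in whatever order the payload lists them. `pivot` and `rolling` would presumably fit the same shape, for instance on top of `DataFrame.pivot_table` and `DataFrame.rolling`.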
Documentation will be added later, probably in the form of OpenAPI specs.

### TEST PLAN
CI + local tests

### ADDITIONAL INFORMATION
<!--- Check any relevant boxes with "x" -->
<!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
- [ ] Has associated issue: #9187
- [ ] Changes UI
- [ ] Requires DB Migration.
- [ ] Confirm DB Migration upgrade and downgrade tested.
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API

### REVIEWERS
@rusackas @suddjian @kristw @john-bodley @etr2460
