[GitHub] [superset] marregui commented on a diff in pull request #24172: feat: add connector for QuestDB timeseries database

via GitHub Tue, 23 May 2023 08:58:28 -0700


marregui commented on code in PR #24172:
URL: https://github.com/apache/superset/pull/24172#discussion_r1202614938



##########
RELEASING/release-notes-2-0/changelog.md:
##########
@@ -128,6 +128,7 @@ under the License.
 - [#19408](https://github.com/apache/superset/pull/19408) feat(dashboard): 
Implement empty states for empty tabs (@kgabryje)
 - [#19446](https://github.com/apache/superset/pull/19446) feat(explore): Move 
chart actions into dropdown (@kgabryje)
 - [#19394](https://github.com/apache/superset/pull/19394) feat(explore): UI 
changes in dataset panel on Explore page (@kgabryje)
+- [#24172](https://github.com/apache/superset/pull/24172) feat: add connector 
for QuestDB timeseries database (@marregui)

Review Comment:
   Hi @john-bodley and @villebro, thank you very much for the quick reviews!
   
   OK, regarding recording future changes: I will remove the changelog entry. 
I searched for other integrations and assumed I was supposed to modify the file 
too, apologies.
   
   I am going to add **unit** tests as well as **integration** tests. However, 
these tests depend on the SQLAlchemy dialect for QuestDB 
(https://pypi.org/project/questdb-connect/), so I would need to add that 
dependency to `requirements/testing.txt`. Is this correct?
   
   One other interesting problem I am facing is the following. QuestDB has this 
SQL extension for aggregating over time intervals:
   
   ```sql
   SELECT avg(balance) FROM accounts SAMPLE BY 1M;
   ```
   
   In particular, this statement returns the simple average balance from a list 
of accounts, aggregated in one-month buckets. In addition, the GROUP BY clause 
is optional and can be omitted, as the optimizer derives the group-by elements 
from the SELECT clause. In standard SQL, users might write a query like the 
following:
   
   ```sql
   SELECT a, b, c, d, sum(e) FROM tab GROUP BY a, b, c, d;
   ```
   
   which in QuestDB could be written as:
   
   ```sql
   SELECT a, b, c, d, sum(e) FROM tab;
   ```
   
   Lastly, QuestDB does not have HAVING, so something like this:
   
   ```sql
   SELECT a, b, c, d, sum(e)
   FROM tab
   GROUP BY a, b, c, d
   HAVING sum(e) > 100;
   ```
   
   would look like this in QuestDB SQL:
   
   ```sql
   (SELECT a, b, c, d, sum(e) s FROM tab) WHERE s > 100;
   ```
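
   As a rough illustration of the two rewrites above (dropping GROUP BY and 
replacing HAVING with a filter on a sub-query), here is a minimal Python sketch. 
The helper `questdb_aggregate_sql` and its parameters are hypothetical names 
made up for this example; they are not part of Superset or questdb-connect, and 
the function only concatenates strings, so it is not suitable for untrusted 
input.

```python
from typing import Optional, Sequence


def questdb_aggregate_sql(
    table: str,
    keys: Sequence[str],
    aggregate: str,
    alias: str,
    having: Optional[str] = None,
) -> str:
    """Build a QuestDB-style aggregate query (hypothetical helper).

    GROUP BY is omitted because QuestDB derives the grouping keys from the
    SELECT list. A HAVING-style condition becomes a WHERE clause applied to
    the aliased aggregate of a parenthesised sub-query.
    """
    # Key columns first, then the aggregate with its alias, e.g. "sum(e) s".
    columns = ", ".join([*keys, f"{aggregate} {alias}"])
    inner = f"SELECT {columns} FROM {table}"
    if having is None:
        return f"{inner};"
    # No HAVING in QuestDB: filter the sub-query result on the alias instead.
    return f"({inner}) WHERE {alias} {having};"


if __name__ == "__main__":
    print(questdb_aggregate_sql("tab", ["a", "b", "c", "d"], "sum(e)", "s"))
    print(questdb_aggregate_sql("tab", ["a", "b", "c", "d"], "sum(e)", "s", "> 100"))
```

   With `having="> 100"` the helper produces the same sub-query form as the 
statement above; without it, the plain SELECT with no GROUP BY.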
   
   The problem I am facing is that the SQL Superset generates when datasets are 
configured into charts does not exploit QuestDB's special syntax, and the 
result is sub-optimal retrieval speed. I can get around this with SQL Lab. It 
would be awesome to have native support for these extensions, but that would 
likely involve changes in the frontend.
   
   I am waiting for https://github.com/questdb/questdb/pull/3390 to be merged, 
and then I will come back to this. ETA today/tomorrow: I would like to have a 
final version ready for review.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

