JackieTien97 opened a new issue, #17337:
URL: https://github.com/apache/iotdb/issues/17337

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/iotdb/issues) and 
found nothing similar.
   
   
   ### Motivation
   
   When performing complex aggregation queries, standard SQL requires users to 
explicitly repeat all non-aggregated columns from the `SELECT` list inside the 
`GROUP BY` clause. In real-world development and data analysis scenarios, this 
introduces several pain points:
   
   1. **Tedious and Error-Prone Queries:** When grouping by multiple 
dimensional attributes or tags (e.g., `region`, `plant_id`, `device_model`, 
`status`), users have to write these fields twice. If a user updates the 
`SELECT` list during iteration but forgets to update the `GROUP BY` list, it 
results in a syntax error.
   2. **High Friction in Exploratory Data Analysis (EDA):** Data analysts 
frequently add or remove dimension columns to observe metrics at different 
granularities. Synchronizing the `SELECT` and `GROUP BY` clauses for every 
minor adjustment breaks the analytical flow and reduces efficiency.
   3. **Alignment with Modern Analytical Databases:** Mainstream modern OLAP 
and analytical databases (such as DuckDB, ClickHouse, etc.) already support the 
`GROUP BY ALL` syntax. It automatically infers the grouping columns, 
significantly simplifying SQL writing.
   
   With the introduction of IoTDB's Table Model, the relational semantics are 
highly geared towards data analysts. Introducing `GROUP BY ALL` will further 
elevate the ease of use and the overall developer experience within the Table 
Model.
   
   ### Solution
   
   We propose introducing the `GROUP BY ALL` syntax into the IoTDB SQL dialect. 
   **Note: To maintain architectural clarity and backward compatibility, this 
feature will ONLY be implemented within the Table Model. The existing Tree 
Model requires no changes.**
   
   The implementation will primarily focus on the frontend parsing and logical 
planning phases of the Table Model:
   
   1. **Update Antlr4 Grammar (Parser):** Modify the table-model-specific 
grammar file 
(`iotdb-core/relational-grammar/src/main/antlr4/org/apache/iotdb/db/relational/grammar/sql/RelationalSql.g4`)
 to support the `ALL` keyword within the `GROUP BY` clause rules.
   2. **Rewrite Analyzer Logic (Analyzer / Logical Planner):** Intercept the 
`GROUP BY ALL` syntax during the semantic analysis phase. If detected, the 
analyzer should iterate through the current `SELECT` clause:
      * Automatically extract all scalar expressions or column references that 
**do not contain aggregate functions**.
      * Implicitly convert these extracted expressions into Grouping Keys and 
inject them into the subsequent Logical Plan.
   3. **Execution Engine Transparency:** This feature is purely syntactic 
sugar. After the AST transformation by the Analyzer, the underlying logical 
plan and physical operators generated will be identical to a query where all 
columns are manually specified. Therefore, the physical execution layer and 
memory control require zero modifications.
   4. **Edge Case Handling:**
      * If the `SELECT` clause consists entirely of aggregate functions (e.g., 
`SELECT COUNT(s1), SUM(s2) FROM table GROUP BY ALL`), the analyzer should 
recognize that there are no non-aggregated columns and equivalently transform 
it into a global aggregation (a query without a `GROUP BY` clause).
      * Ensure seamless compatibility with other clauses, such as `ORDER BY`.
   
   **Example Usage:**
   
   ```sql
   USE plant_database;
   
   -- Current Syntax (Standard, but verbose)
   SELECT 
       region, 
       plant_id, 
       device_id, 
       AVG(temperature) AS avg_temp, 
       MAX(vibration) AS max_vib
   FROM sensor_data 
   WHERE time >= 10
   GROUP BY region, plant_id, device_id;
   
   -- Proposed Syntax (Using GROUP BY ALL)
   SELECT 
       region, 
       plant_id, 
       device_id, 
       AVG(temperature) AS avg_temp, 
       MAX(vibration) AS max_vib
   FROM sensor_data 
   WHERE time >= 10
   GROUP BY ALL;
   ```
   
   ### Alternatives
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to