Re: [PR] docs(db_engine_specs): restructure feature table for GitHub rendering [superset]

via GitHub Wed, 22 Oct 2025 16:50:49 -0700


korbit-ai[bot] commented on code in PR #35809:
URL: https://github.com/apache/superset/pull/35809#discussion_r2453597373



##########
superset/db_engine_specs/lib.py:
##########
@@ -209,9 +210,374 @@ def get_name(spec: type[BaseEngineSpec]) -> str:
     return spec.engine_name or spec.engine
 
 
+def format_markdown_table(headers: list[str], rows: list[list[Any]]) -> str:
+    """
+    Format headers and rows into a markdown table.
+    """
+    lines = []
+    lines.append("| " + " | ".join(headers) + " |")
+    lines.append("| " + " | ".join(["---"] * len(headers)) + " |")
+    for row in rows:
+        lines.append("| " + " | ".join(str(col) for col in row) + " |")
+    return "\n".join(lines)
+
+
+def generate_focused_table(
+    info: dict[str, dict[str, Any]],
+    feature_keys: list[str],
+    column_labels: list[str],
+    filter_fn: Any = None,
+    value_extractor: Any = None,
+    preserve_order: bool = False,
+) -> tuple[str, list[str]]:
+    """
+    Generate a focused markdown table with databases as rows.
+
+    Args:
+        info: Dictionary mapping database names to their feature info
+        feature_keys: List of feature keys to extract from db_info
+        column_labels: List of column header labels
+        filter_fn: Optional function to filter databases (receives db_info 
dict)
+        value_extractor: Optional function to extract value (receives db_info, 
key)
+
+    Returns:
+        Tuple of (markdown table string, list of excluded database names)
+    """
+    # Filter databases if filter function provided
+    filtered_info = {}
+    excluded_dbs = []
+
+    for db_name, db_info in info.items():
+        if filter_fn is None or filter_fn(db_info):
+            filtered_info[db_name] = db_info
+        else:
+            excluded_dbs.append(db_name)
+
+    if not filtered_info:
+        return "", excluded_dbs
+
+    # Build headers: Database + feature columns
+    headers = ["Database"] + column_labels
+
+    # Build rows
+    rows = []
+    # Sort by database name unless preserve_order is True
+    db_names = (
+        list(filtered_info.keys()) if preserve_order else 
sorted(filtered_info.keys())
+    )
+
+    for db_name in db_names:
+        db_info = filtered_info[db_name]
+        row = [db_name]
+
+        for key in feature_keys:
+            if value_extractor:
+                value = value_extractor(db_info, key)
+            else:
+                value = db_info.get(key, "")
+            row.append(value)
+
+        rows.append(row)
+
+    return format_markdown_table(headers, rows), excluded_dbs
+
+
+def calculate_support_level(db_info: dict[str, Any], feature_keys: list[str]) 
-> str:
+    """
+    Calculate support level for a group of features.
+
+    Returns: "Supported", "Partial", or "Not supported"
+    """
+    if not feature_keys:
+        return "Not supported"
+
+    # Handle time grain features specially
+    if all(k.startswith("time_grains.") for k in feature_keys):
+        grain_keys = [k.split(".", 1)[1] for k in feature_keys]
+        supported = sum(
+            1 for grain in grain_keys if db_info["time_grains"].get(grain, 
False)
+        )
+    else:
+        supported = sum(1 for k in feature_keys if db_info.get(k, False))
+
+    total = len(feature_keys)
+    if supported == 0:
+        return "Not supported"
+    elif supported == total:
+        return "Supported"
+    else:
+        return "Partial"
+
+
+def generate_feature_tables() -> str:
+    """
+    Generate multiple focused markdown tables organized by feature categories.
+
+    Returns a complete markdown document with 7 tables optimized for 
readability.
+    """
+    info = {}
+    for spec in sorted(load_engine_specs(), key=get_name):
+        info[get_name(spec)] = diagnose(spec)
+
+    # remove 3rd party DB engine specs
+    info = {k: v for k, v in info.items() if 
v["module"].startswith("superset")}
+
+    # Sort by score descending for overview table
+    sorted_info = dict(sorted(info.items(), key=lambda x: x[1]["score"], 
reverse=True))
+
+    output = []
+
+    # Table 1: Feature Overview
+    output.append("### Feature Overview\n")
+
+    # Define feature groups for summary
+    sql_basics = [
+        "joins",
+        "subqueries",
+        "alias_in_select",
+        "alias_in_orderby",
+        "cte_in_subquery",
+    ]
+    advanced_sql = [
+        "time_groupby_inline",
+        "alias_to_source_column",
+        "order_by_not_in_select",
+        "expressions_in_orderby",
+    ]
+    common_grains = [
+        f"time_grains.{g}"
+        for g in ["SECOND", "MINUTE", "HOUR", "DAY", "WEEK", "MONTH", 
"QUARTER", "YEAR"]
+    ]
+    extended_grains = [
+        f"time_grains.{g}"
+        for g in [
+            "FIVE_SECONDS",
+            "THIRTY_SECONDS",
+            "FIVE_MINUTES",
+            "TEN_MINUTES",
+            "FIFTEEN_MINUTES",
+            "THIRTY_MINUTES",
+            "HALF_HOUR",
+            "SIX_HOURS",
+            "WEEK_STARTING_SUNDAY",
+            "WEEK_STARTING_MONDAY",
+            "WEEK_ENDING_SATURDAY",
+            "WEEK_ENDING_SUNDAY",
+            "QUARTER_YEAR",
+        ]
+    ]
+    integrations = [
+        "ssh_tunneling",
+        "query_cancelation",
+        "get_metrics",
+        "get_extra_table_metadata",
+        "dbapi_exception_mapping",
+        "custom_errors",
+        "dynamic_schema",
+        "where_latest_partition",
+    ]
+    advanced_features = [
+        "user_impersonation",
+        "expand_data",
+        "query_cost_estimation",
+        "sql_validation",
+    ]
+
+    headers = [
+        "Database",
+        "Score",
+        "SQL Basics",
+        "Advanced SQL",
+        "Common Time Grains",
+        "Extended Time Grains",
+        "Integrations",
+        "Advanced Features",
+    ]
+    rows = []
+    for db_name, db_info in sorted_info.items():
+        row = [
+            db_name,
+            db_info["score"],
+            calculate_support_level(db_info, sql_basics),
+            calculate_support_level(db_info, advanced_sql),
+            calculate_support_level(db_info, common_grains),
+            calculate_support_level(db_info, extended_grains),
+            calculate_support_level(db_info, integrations),
+            calculate_support_level(db_info, advanced_features),
+        ]
+        rows.append(row)
+    output.append(format_markdown_table(headers, rows))
+
+    # Table 2: Database Information
+    output.append("\n### Database Information\n")
+
+    # Custom value extractor for database info to handle limit_method enum
+    def extract_db_info(db_info: dict[str, Any], key: str) -> str:
+        if key == "limit_method":
+            # Convert enum value to name
+            from superset.sql.parse import LimitMethod
+
+            return LimitMethod(db_info[key]).name
+        return db_info.get(key, "")

Review Comment:
   ### Import statement inside frequently called function <sub>![category 
Performance](https://img.shields.io/badge/Performance-4f46e5)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The import statement for LimitMethod is placed inside the function and 
executed on every call when key equals 'limit_method'.
   
   
   ###### Why this matters
   Repeated imports inside function calls create unnecessary overhead, 
especially when the function is called multiple times during table generation.
   
   ###### Suggested change ∙ *Feature Preview*
   Move the import to the top of the function or module level:
   
   ```python
   from superset.sql.parse import LimitMethod
   
   def extract_db_info(db_info: dict[str, Any], key: str) -> str:
       if key == "limit_method":
           return LimitMethod(db_info[key]).name
       return db_info.get(key, "")
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7927cbed-b54c-4152-af9d-03fd79593a31/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7927cbed-b54c-4152-af9d-03fd79593a31?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7927cbed-b54c-4152-af9d-03fd79593a31?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7927cbed-b54c-4152-af9d-03fd79593a31?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7927cbed-b54c-4152-af9d-03fd79593a31)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:87e30e14-88dd-4cb8-a78e-c25a3095013f -->
   
   
   [](87e30e14-88dd-4cb8-a78e-c25a3095013f)



##########
superset/db_engine_specs/lib.py:
##########
@@ -209,9 +210,374 @@ def get_name(spec: type[BaseEngineSpec]) -> str:
     return spec.engine_name or spec.engine
 
 
+def format_markdown_table(headers: list[str], rows: list[list[Any]]) -> str:
+    """
+    Format headers and rows into a markdown table.
+    """
+    lines = []
+    lines.append("| " + " | ".join(headers) + " |")
+    lines.append("| " + " | ".join(["---"] * len(headers)) + " |")
+    for row in rows:
+        lines.append("| " + " | ".join(str(col) for col in row) + " |")
+    return "\n".join(lines)
+
+
+def generate_focused_table(
+    info: dict[str, dict[str, Any]],
+    feature_keys: list[str],
+    column_labels: list[str],
+    filter_fn: Any = None,
+    value_extractor: Any = None,
+    preserve_order: bool = False,
+) -> tuple[str, list[str]]:

Review Comment:
   ### Imprecise Callable Type Hints <sub>![category 
Design](https://img.shields.io/badge/Design-0d9488)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The function uses 'Any' type for filter_fn and value_extractor parameters 
instead of proper callable type hints
   
   
   ###### Why this matters
   Using 'Any' reduces type safety and makes it harder for developers to 
understand the expected function signatures for these callbacks
   
   ###### Suggested change ∙ *Feature Preview*
   Replace Any with proper callable type hints:
   ```python
   from typing import Callable
   
   def generate_focused_table(
       info: dict[str, dict[str, Any]],
       feature_keys: list[str],
       column_labels: list[str],
       filter_fn: Callable[[dict[str, Any]], bool] | None = None,
       value_extractor: Callable[[dict[str, Any], str], Any] | None = None,
       preserve_order: bool = False,
   ) -> tuple[str, list[str]]:
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/f1725f35-54b9-424e-acb4-d615908bc292/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/f1725f35-54b9-424e-acb4-d615908bc292?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/f1725f35-54b9-424e-acb4-d615908bc292?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/f1725f35-54b9-424e-acb4-d615908bc292?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/f1725f35-54b9-424e-acb4-d615908bc292)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:b687e6c0-14d3-4bd8-a631-6ec2606b6ce4 -->
   
   
   [](b687e6c0-14d3-4bd8-a631-6ec2606b6ce4)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] docs(db_engine_specs): restructure feature table for GitHub rendering [superset]

Reply via email to