codeant-ai-for-open-source[bot] commented on code in PR #37680: URL: https://github.com/apache/superset/pull/37680#discussion_r2766108878
########## superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py: ########## @@ -0,0 +1,79 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +"""Add oboarding workflow for creating a dashboard with none existing chart + +Revision ID: a55bbb4da9a6 +Revises: 91b151e8dac1 +Create Date: 2026-02-04 17:02:54.527420 + +""" + +# revision identifiers, used by Alembic. +revision = 'a55bbb4da9a6' +down_revision = '91b151e8dac1' + +from alembic import op +import sqlalchemy as sa + +onboarding_workflow_table = sa.Table( + "onboarding_workflow", + sa.MetaData(), + sa.Column("id", sa.Integer, primary_key=True), + sa.Column("name", sa.String(100), unique=True), + sa.Column("description", sa.String(255)), +) + +user_onboarding_workflow_table = sa.Table( + "user_onboarding_workflow", + sa.MetaData(), + sa.Column("user_id", sa.Integer, sa.ForeignKey("ab_user.id")), + sa.Column("workflow_id", sa.Integer, sa.ForeignKey("onboarding_workflow.id")), + sa.Column("visited", sa.Boolean, default=False), + sa.Column("visited_times", sa.Integer, default=0), +) + + +def upgrade(): + bind = op.get_bind() + + # Insert the new onboarding workflow + insert_stmt = onboarding_workflow_table.insert().values( + name="CREATE_DASHBOARD_WITH_NO_EXISTING_CHART", + description="Onboarding workflow for creating a dashboard with none existing chart" + ) + + result = bind.execute(insert_stmt) + workflow_id = result.inserted_primary_key[0] + + # Insert a user_onboarding_workflow row for each existing user + # All active users should have this workflow in their onboarding list, + # but Gamma and Public users will not mark this workflow as visited since + # they won't have write permissions to the API. + insert_user_workflows = sa.text( + """ + INSERT INTO user_onboarding_workflow (user_id, workflow_id, visited, visited_times) + SELECT u.id, :workflow_id, false, 0 + FROM ab_user u + WHERE u.active = true + """ + ) + + bind.execute(insert_user_workflows, {"workflow_id": workflow_id}) Review Comment: **Suggestion:** The raw SQL uses literal boolean values `false` and `true`, which are not valid boolean literals on SQLite and will cause this migration to fail when running tests or using SQLite, so the booleans should be passed as bound parameters that SQLAlchemy can adapt per backend. [type error] <details> <summary><b>Severity Level:</b> Major ⚠️</summary> ```mdx - ❌ Migration fails on SQLite local or test environments. - ⚠️ Developer onboarding and unit tests blocked. - ⚠️ Postgres production unaffected by this syntax. ``` </details> ```suggestion SELECT u.id, :workflow_id, :visited, :visited_times FROM ab_user u WHERE u.active = :active """ ) bind.execute( insert_user_workflows, { "workflow_id": workflow_id, "visited": False, "visited_times": 0, "active": True, }, ) ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Run the migration against a SQLite-backed database (common in local developer setups or some test environments). The migration file is `superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py`. 2. During upgrade(), execution reaches the raw SQL block defined at lines 66-74 where the SQL text contains `false` and `true` literals. 3. bind.execute(insert_user_workflows, {"workflow_id": workflow_id}) is invoked at line 75. SQLite does not accept the bare `false`/`true` tokens in SQL the same way Postgres does; the backend will raise a SQL/Operational error or treat the tokens unexpectedly when the DB driver executes the raw SQL. 4. Observe migration failure in local dev or CI that uses SQLite: the upgrade aborts with a SQL error at `...:75`, blocking tests or developer workflows. Passing booleans as bound parameters (as in the improved code) lets SQLAlchemy adapt them to the backend. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py **Line:** 69:75 **Comment:** *Type Error: The raw SQL uses literal boolean values `false` and `true`, which are not valid boolean literals on SQLite and will cause this migration to fail when running tests or using SQLite, so the booleans should be passed as bound parameters that SQLAlchemy can adapt per backend. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/models/user_onboarding_workflow.py: ########## @@ -0,0 +1,46 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +from flask_appbuilder import Model +from sqlalchemy import Column, ForeignKey, Integer, Boolean +from sqlalchemy.orm import relationship + +from superset import security_manager +from superset.models.helpers import AuditMixinNullable + + +class UserOnboardingWorkflow(Model, AuditMixinNullable): + """ + Stores per-user state for onboarding workflows. + + This model tracks a user's interaction with a specific onboarding workflow, + including whether the workflow has been visited and how many times it has + been completed or re-run. It enables Superset to determine when onboarding + experiences should be shown automatically and allows users to re-trigger + workflows as needed. + + This table represents user-specific state only and should not contain any + workflow definition or UI configuration details. + """ + __tablename__ = "user_onboarding_workflow" + user_id = Column(Integer, ForeignKey("ab_user.id")) Review Comment: **Suggestion:** The ORM model does not define any primary key column, which will cause SQLAlchemy/Flask AppBuilder to fail mapper configuration at import time (e.g., "Mapper ... does not have any primary key columns") and prevent normal CRUD operations on this table; add a primary key column to the model. [logic error] <details> <summary><b>Severity Level:</b> Critical 🚨</summary> ```mdx - ❌ Superset webserver fails to import models on startup. - ❌ DB migrations involving this model fail. - ⚠️ Onboarding workflow CRUD unusable until fixed. ``` </details> ```suggestion id = Column(Integer, primary_key=True) ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Start the Superset process that imports models so SQLAlchemy/FAB maps ORM classes. The class mapping in `superset/models/user_onboarding_workflow.py` begins at `class UserOnboardingWorkflow` (file lines 25-46) and defines columns at lines 38-45. 2. During module import (application startup), SQLAlchemy inspects `UserOnboardingWorkflow` mapping (lines 25 and 38-45). Because no primary key Column is declared, SQLAlchemy's mapper validation runs against this class. 3. SQLAlchemy raises a Mapper configuration error (e.g., "Mapper 'UserOnboardingWorkflow' has no primary key") while importing `superset/models/user_onboarding_workflow.py` (around the same lines), preventing the app from finishing initialization or using the model. 4. As evidence: the only column-like attributes in this file are `user_id`, `workflow_id`, `visited`, `visited_times` (lines 39-40 and 44-45) and no `primary_key=True` is present; adding `id = Column(Integer, primary_key=True)` fixes the missing-primary-key mapper failure. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/models/user_onboarding_workflow.py **Line:** 39:39 **Comment:** *Logic Error: The ORM model does not define any primary key column, which will cause SQLAlchemy/Flask AppBuilder to fail mapper configuration at import time (e.g., "Mapper ... does not have any primary key columns") and prevent normal CRUD operations on this table; add a primary key column to the model. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py: ########## @@ -0,0 +1,79 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +"""Add oboarding workflow for creating a dashboard with none existing chart + +Revision ID: a55bbb4da9a6 +Revises: 91b151e8dac1 +Create Date: 2026-02-04 17:02:54.527420 + +""" + +# revision identifiers, used by Alembic. +revision = 'a55bbb4da9a6' +down_revision = '91b151e8dac1' + +from alembic import op +import sqlalchemy as sa + +onboarding_workflow_table = sa.Table( + "onboarding_workflow", + sa.MetaData(), + sa.Column("id", sa.Integer, primary_key=True), + sa.Column("name", sa.String(100), unique=True), + sa.Column("description", sa.String(255)), +) + +user_onboarding_workflow_table = sa.Table( + "user_onboarding_workflow", + sa.MetaData(), + sa.Column("user_id", sa.Integer, sa.ForeignKey("ab_user.id")), + sa.Column("workflow_id", sa.Integer, sa.ForeignKey("onboarding_workflow.id")), + sa.Column("visited", sa.Boolean, default=False), + sa.Column("visited_times", sa.Integer, default=0), +) + + +def upgrade(): + bind = op.get_bind() + + # Insert the new onboarding workflow + insert_stmt = onboarding_workflow_table.insert().values( + name="CREATE_DASHBOARD_WITH_NO_EXISTING_CHART", + description="Onboarding workflow for creating a dashboard with none existing chart" + ) + + result = bind.execute(insert_stmt) + workflow_id = result.inserted_primary_key[0] + + # Insert a user_onboarding_workflow row for each existing user + # All active users should have this workflow in their onboarding list, + # but Gamma and Public users will not mark this workflow as visited since + # they won't have write permissions to the API. + insert_user_workflows = sa.text( + """ + INSERT INTO user_onboarding_workflow (user_id, workflow_id, visited, visited_times) + SELECT u.id, :workflow_id, false, 0 + FROM ab_user u + WHERE u.active = true + """ + ) + + bind.execute(insert_user_workflows, {"workflow_id": workflow_id}) + + +def downgrade(): + pass Review Comment: **Suggestion:** The downgrade function is a no-op, so rolling back this migration will leave the inserted onboarding workflow and all related user-onboarding rows in place, violating the expectation that migrations are reversible and potentially causing duplicate or stale data if a similar workflow is re-added later. [logic error] <details> <summary><b>Severity Level:</b> Major ⚠️</summary> ```mdx - ❌ Alembic migration rollback leaves stale onboarding rows. - ⚠️ Re-applying migration may trigger unique constraint error. - ⚠️ Local/CI rollback workflow/tests may be blocked. ``` </details> ```suggestion bind = op.get_bind() # Find the workflow ID created in upgrade workflow_id = bind.execute( sa.select(onboarding_workflow_table.c.id).where( onboarding_workflow_table.c.name == "CREATE_DASHBOARD_WITH_NO_EXISTING_CHART" ) ).scalar() if workflow_id is not None: # Remove user onboarding workflow rows for this workflow bind.execute( user_onboarding_workflow_table.delete().where( user_onboarding_workflow_table.c.workflow_id == workflow_id ) ) # Remove the onboarding workflow itself bind.execute( onboarding_workflow_table.delete().where( onboarding_workflow_table.c.id == workflow_id ) ) ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Apply the migration via Alembic: run `alembic upgrade head`. This executes the upgrade() in file `superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py` starting at line 50; the migration inserts the workflow at lines 54-60 and inserts per-user rows at lines 66-75. 2. Attempt to roll back the migration with `alembic downgrade 91b151e8dac1`. Alembic calls downgrade() defined at `..._add_create_dashboard_with_none_existing_.py:78-79`. Because downgrade() is a no-op (`pass`), the workflow row and user_onboarding_workflow rows inserted in step 1 remain in the database. 3. Re-run the migration (another `alembic upgrade head`) which hits the insert at `...py:54-60` again. Because the previously inserted workflow still exists, the insert may violate the workflow name uniqueness or otherwise lead to unexpected state (upgrade fails or produces duplicate-stale state depending on DB constraints). 4. Observe irreversible state: database still contains onboarding entries despite downgrade; test/rollback workflows and CI that expect reversible migrations will fail. This demonstrates the downgrade being a real, reproducible problem because the file's upgrade creates persistent rows but downgrade does not remove them. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/migrations/versions/2026-02-04_17-02_a55bbb4da9a6_add_create_dashboard_with_none_existing_.py **Line:** 79:79 **Comment:** *Logic Error: The downgrade function is a no-op, so rolling back this migration will leave the inserted onboarding workflow and all related user-onboarding rows in place, violating the expectation that migrations are reversible and potentially causing duplicate or stale data if a similar workflow is re-added later. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/migrations/versions/2026-02-04_16-32_91b151e8dac1_add_onboarding_workflows.py: ########## @@ -0,0 +1,57 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +"""Add onboarding workflows tables + +Revision ID: 91b151e8dac1 +Revises: 9787190b3d89 +Create Date: 2026-02-04 16:32:25.570254 + +""" + +# revision identifiers, used by Alembic. +revision = '91b151e8dac1' +down_revision = '9787190b3d89' + +from alembic import op +import sqlalchemy as sa + +def upgrade(): + op.create_table( + "onboarding_workflow", + sa.Column("created_on", sa.DateTime(), nullable=True), + sa.Column("changed_on", sa.DateTime(), nullable=True), + sa.Column("id", sa.Integer, primary_key=True), + sa.Column("name", sa.String(100), unique=True), + sa.Column("description", sa.String(255)), + ) + + op.create_table( + "user_onboarding_workflow", + sa.Column("created_on", sa.DateTime(), nullable=True), + sa.Column("changed_on", sa.DateTime(), nullable=True), + sa.Column("user_id", sa.Integer()), + sa.Column("workflow_id", sa.Integer()), + sa.Column("visited", sa.Boolean, default=False), + sa.Column("visited_times", sa.Integer, default=0), + sa.ForeignKeyConstraint(["user_id"], ["ab_user.id"]), Review Comment: **Suggestion:** The mapping table is created without any primary key, which allows duplicate rows for the same user/workflow pair and can break ORM assumptions that every row is uniquely identifiable, making updates and deletes error-prone. [logic error] <details> <summary><b>Severity Level:</b> Major ⚠️</summary> ```mdx - ❌ Onboarding workflow tracking may store duplicate rows. - ⚠️ ORM updates/deletes may target multiple rows unexpectedly. ``` </details> ```suggestion sa.PrimaryKeyConstraint("user_id", "workflow_id"), ``` <details> <summary><b>Steps of Reproduction ✅ </b></summary> ```mdx 1. Checkout the branch containing the migration file `superset/migrations/versions/2026-02-04_16-32_91b151e8dac1_add_onboarding_workflows.py` and open the op.create_table for `user_onboarding_workflow` (definition begins at file line 42 in the PR hunk). 2. Apply migrations (this migration uses the code at lines 42-52) by running the project's normal upgrade step (for example, `superset db upgrade` or the alembic equivalent). The `user_onboarding_workflow` table will be created without any primary key because the code at lines 42-52 does not declare one. 3. Using a SQL client (psql / mysql client) or SQLAlchemy engine, run two identical inserts into the new table, e.g.: - `INSERT INTO user_onboarding_workflow (user_id, workflow_id, visited, visited_times) VALUES (1, 1, true, 1);` - Run the same INSERT again. Because there is no primary key or unique constraint on (user_id, workflow_id), both INSERTs succeed and two duplicate rows appear. 4. Verify duplicates with `SELECT count(*) FROM user_onboarding_workflow WHERE user_id=1 AND workflow_id=1;` which returns 2. This demonstrates the concrete problem caused by missing PK at the migration code at lines 42-52. ``` </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/migrations/versions/2026-02-04_16-32_91b151e8dac1_add_onboarding_workflows.py **Line:** 50:50 **Comment:** *Logic Error: The mapping table is created without any primary key, which allows duplicate rows for the same user/workflow pair and can break ORM assumptions that every row is uniquely identifiable, making updates and deletes error-prone. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
