dosubot[bot] commented on issue #30377:
URL: https://github.com/apache/superset/issues/30377#issuecomment-2371761128

   <!-- Greeting -->
   Hey @luizcapu! 👋 I'm here to help with bugs, answer questions, and assist 
you while you wait for a human maintainer. Let's squash those bugs together!
   
   <!-- Answer -->
   The `get_or_create` Dataset API endpoint returns a `500 - Internal Server 
Error` when you try to create a dataset with the same `table_name` but a 
different `schema` because the current implementation looks up an existing 
table only by its `table_name` and `database_id`. The `schema` is not part of 
that check. If a table with the same `table_name` already exists in the 
database, the endpoint returns the existing table's ID regardless of its 
schema, which can lead to conflicts or errors 
<sup>[[1]](https://github.com/apache/superset/blob/master/superset/datasets/api.py)</sup>.
   
   The relevant code snippet is:
   
   ```python
   @expose("/get_or_create/", methods=("POST",))
   @protect()
   @safe
   @statsd_metrics
   @event_logger.log_this_with_context(
       action=lambda self, *args, **kwargs: f"{self.__class__.__name__}"
       f".get_or_create_dataset",
       log_to_statsd=False,
   )
   def get_or_create_dataset(self) -> Response:
       """Retrieve a dataset by name, or create it if it does not exist.
       ---
       post:
         summary: Retrieve a table by name, or create it if it does not exist
         requestBody:
           required: true
           content:
             application/json:
               schema:
                 $ref: '#/components/schemas/GetOrCreateDatasetSchema'
         responses:
           200:
             description: The ID of the table
             content:
               application/json:
                 schema:
                   type: object
                   properties:
                     result:
                       type: object
                       properties:
                         table_id:
                           type: integer
           400:
             $ref: '#/components/responses/400'
           401:
             $ref: '#/components/responses/401'
           422:
             $ref: '#/components/responses/422'
           500:
             $ref: '#/components/responses/500'
       """
       try:
           body = GetOrCreateDatasetSchema().load(request.json)
       except ValidationError as ex:
           return self.response(400, message=ex.messages)
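       table_name = body["table_name"]
       database_id = body["database_id"]
       # The lookup receives only database_id and table_name; schema is not passed.
       if table := DatasetDAO.get_table_by_name(database_id, table_name):
           return self.response(200, result={"table_id": table.id})
       # Excerpt ends here; the rest of the method is not shown.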
   ```
   
   The code only checks for the existence of a table by `table_name` and 
`database_id`:
   
   ```python
   if table := DatasetDAO.get_table_by_name(database_id, table_name):
       return self.response(200, result={"table_id": table.id})
   ```
   
   Because the `schema` is not part of this lookup, a request for the same 
`table_name` under a different `schema` is matched against the existing 
dataset instead of being treated as a new one. That mismatch can surface as a 
`500 - Internal Server Error` 
<sup>[[1]](https://github.com/apache/superset/blob/master/superset/datasets/api.py)</sup>.
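
As a rough illustration only, the difference between a name-only check and a schema-aware one can be sketched with plain Python. Everything here is hypothetical: the `Dataset` dataclass, the in-memory `DATASETS` list, and both `find_*` helpers are stand-ins invented for this example and are not part of Superset's `DatasetDAO`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a dataset row (not Superset's model)."""
    id: int
    database_id: int
    schema: Optional[str]
    table_name: str


# Hypothetical in-memory table: one dataset already exists in schema "sales".
DATASETS: List[Dataset] = [
    Dataset(id=1, database_id=1, schema="sales", table_name="orders"),
]


def find_by_name(database_id: int, table_name: str) -> Optional[Dataset]:
    """Name-only lookup: ignores the schema, like the check described above."""
    for dataset in DATASETS:
        if dataset.database_id == database_id and dataset.table_name == table_name:
            return dataset
    return None


def find_by_name_and_schema(
    database_id: int, schema: Optional[str], table_name: str
) -> Optional[Dataset]:
    """Schema-aware lookup: only matches when the schema is the same too."""
    for dataset in DATASETS:
        if (
            dataset.database_id == database_id
            and dataset.schema == schema
            and dataset.table_name == table_name
        ):
            return dataset
    return None


# A request for table "orders" in a different schema, "archive":
name_only = find_by_name(1, "orders")
schema_aware = find_by_name_and_schema(1, "archive", "orders")
```

With the name-only helper, the request for `archive.orders` is answered with the existing `sales.orders` dataset. The schema-aware helper finds no match, so a caller could go on to create the new dataset. Whether a fix in Superset should take this shape is an assumption of the sketch, not something the linked source confirms.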
   
   Additionally, the test case `test_create_dataset_same_name_different_schema` 
in the integration tests demonstrates that creating datasets with the same 
`table_name` but different `schema` is a valid scenario, which the current 
implementation does not handle correctly 
<sup>[[2]](https://github.com/apache/superset/blob/master/tests/integration_tests/datasets/api_tests.py)</sup>.
   
   
   <!-- Dosu Comment Footer -->
   *To continue talking to [Dosu](https://dosu.dev), mention @dosu.*
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

