korbit-ai[bot] commented on code in PR #32670:
URL: https://github.com/apache/superset/pull/32670#discussion_r1996420476


##########
superset/commands/importers/v1/assets.py:
##########
@@ -79,12 +82,29 @@
             kwargs.get("ssh_tunnel_priv_key_passwords") or {}
         )
         self._configs: dict[str, Any] = {}
+        self.sparse = kwargs.get("sparse", False)
 
     # pylint: disable=too-many-locals
     @staticmethod
-    def _import(configs: dict[str, Any]) -> None:  # noqa: C901
+    def _import(configs: dict[str, Any], sparse: bool = False) -> None:  # 
noqa: C901
         # import databases first
         database_ids: dict[str, int] = {}
+        dataset_info: dict[str, dict[str, Any]] = {}
+        chart_ids: dict[str, int] = {}
+
+        if sparse:
+            existing_charts = db.session.query(Slice).all()
+            existing_datasets = db.session.query(SqlaTable).all()
+            existing_databases = db.session.query(Database).all()
+            chart_ids = {str(x.uuid): x.id for x in existing_charts}
+            database_ids = {str(x.uuid): x.id for x in existing_databases}

Review Comment:
   ### Inefficient bulk loading in sparse mode <sub>![category 
Performance](https://img.shields.io/badge/Performance-4f46e5)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The sparse import functionality queries all existing charts, datasets, and 
databases, which could lead to performance issues with large databases.
   
   ###### Why this matters
   Loading all records into memory at once for large installations could cause 
excessive memory usage and slow down the import process.
   
   ###### Suggested change ∙ *Feature Preview*
   Use batched queries to load records in chunks and implement a lookup 
mechanism. Example:
   
   ```python
   def get_existing_items_batched(model_class, batch_size=1000):
       offset = 0
       mapping = {}
       while True:
           batch = 
db.session.query(model_class).limit(batch_size).offset(offset).all()
           if not batch:
               break
           mapping.update({str(x.uuid): x.id for x in batch})
           offset += batch_size
       return mapping
   
   if sparse:
       chart_ids = get_existing_items_batched(Slice)
       database_ids = get_existing_items_batched(Database)
       dataset_info = {}
       for dataset in get_existing_items_batched(SqlaTable):
           dataset_info[str(dataset.uuid)] = {
               "datasource_id": dataset.id,
               "datasource_type": dataset.datasource_type,
               "datasource_name": dataset.datasource_name,
           }
   ```
   
   
   </details>
   
   <sub>
   
   [![Report a problem with this 
comment](https://img.shields.io/badge/Report%20a%20problem%20with%20this%20comment-gray.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJub25lIiBzdHJva2U9IiNmNWVjMDAiIHN0cm9rZS13aWR0aD0iMiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBjbGFzcz0ibHVjaWRlIGx1Y2lkZS10cmlhbmdsZS1hbGVydCI+PHBhdGggZD0ibTIxLjczIDE4LTgtMTRhMiAyIDAgMCAwLTMuNDggMGwtOCAxNEEyIDIgMCAwIDAgNCAyMWgxNmEyIDIgMCAwIDAgMS43My0zIi8+PHBhdGggZD0iTTEyIDl2NCIvPjxwYXRoIGQ9Ik0xMiAxN2guMDEiLz48L3N2Zz4=)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/7aafef91-04ea-4a94-b999-8dfbc0f336f0?suggestedFixEnabled=true)
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:c791c099-da99-42f5-bc8a-8af1e6298894 -->
   



##########
superset/importexport/api.py:
##########
@@ -177,6 +177,7 @@ def import_(self) -> Response:
 
         if not contents:
             raise NoValidFilesFoundError()
+        sparse = request.form.get("sparse") == "true"

Review Comment:
   ### Brittle Boolean Parsing <sub>![category 
Functionality](https://img.shields.io/badge/Functionality-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The code uses string comparison ("true") for determining the sparse 
parameter value, which can be error-prone and miss other valid boolean 
representations.
   
   ###### Why this matters
   This strict string comparison could cause the sparse functionality to fail 
if clients send variations like 'True', '1', or other valid boolean 
representations, leading to unexpected import behavior.
   
   ###### Suggested change ∙ *Feature Preview*
   Use a more robust boolean parsing method that handles various truthy values:
   ```python
   def parse_boolean(value):
       if isinstance(value, bool):
           return value
       if isinstance(value, str):
           value = value.lower()
           if value in ('true', 't', '1', 'yes', 'y'):
               return True
           if value in ('false', 'f', '0', 'no', 'n'):
               return False
       return False
   
   sparse = parse_boolean(request.form.get("sparse"))
   ```
   
   
   </details>
   
   <sub>
   
   [![Report a problem with this 
comment](https://img.shields.io/badge/Report%20a%20problem%20with%20this%20comment-gray.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJub25lIiBzdHJva2U9IiNmNWVjMDAiIHN0cm9rZS13aWR0aD0iMiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBjbGFzcz0ibHVjaWRlIGx1Y2lkZS10cmlhbmdsZS1hbGVydCI+PHBhdGggZD0ibTIxLjczIDE4LTgtMTRhMiAyIDAgMCAwLTMuNDggMGwtOCAxNEEyIDIgMCAwIDAgNCAyMWgxNmEyIDIgMCAwIDAgMS43My0zIi8+PHBhdGggZD0iTTEyIDl2NCIvPjxwYXRoIGQ9Ik0xMiAxN2guMDEiLz48L3N2Zz4=)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/cdb3f938-6040-476a-8a7f-dc7990c777b5?suggestedFixEnabled=true)
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:bb304b23-8333-4ae1-b969-cabfa760fb1a -->
   



##########
superset/commands/importers/v1/assets.py:
##########
@@ -79,12 +82,29 @@
             kwargs.get("ssh_tunnel_priv_key_passwords") or {}
         )
         self._configs: dict[str, Any] = {}
+        self.sparse = kwargs.get("sparse", False)
 
     # pylint: disable=too-many-locals
     @staticmethod
-    def _import(configs: dict[str, Any]) -> None:  # noqa: C901
+    def _import(configs: dict[str, Any], sparse: bool = False) -> None:  # 
noqa: C901

Review Comment:
   ### Missing parameter documentation in _import method <sub>![category 
Documentation](https://img.shields.io/badge/Documentation-7c3aed)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The `sparse` parameter is not documented despite being a significant 
behavior flag.
   
   ###### Why this matters
   Without understanding what 'sparse' does, developers may misuse this 
parameter, leading to unexpected import behavior.
   
   ###### Suggested change ∙ *Feature Preview*
   @staticmethod
       def _import(configs: dict[str, Any], sparse: bool = False) -> None:  # 
noqa: C901
           """Import all the assets from the configs.
   
           Args:
               configs: The configuration dictionary for assets
               sparse: When True, uses existing assets (charts, datasets, 
databases) instead of recreating them
           """
   
   
   </details>
   
   <sub>
   
   [![Report a problem with this 
comment](https://img.shields.io/badge/Report%20a%20problem%20with%20this%20comment-gray.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJub25lIiBzdHJva2U9IiNmNWVjMDAiIHN0cm9rZS13aWR0aD0iMiIgc3Ryb2tlLWxpbmVjYXA9InJvdW5kIiBzdHJva2UtbGluZWpvaW49InJvdW5kIiBjbGFzcz0ibHVjaWRlIGx1Y2lkZS10cmlhbmdsZS1hbGVydCI+PHBhdGggZD0ibTIxLjczIDE4LTgtMTRhMiAyIDAgMCAwLTMuNDggMGwtOCAxNEEyIDIgMCAwIDAgNCAyMWgxNmEyIDIgMCAwIDAgMS43My0zIi8+PHBhdGggZD0iTTEyIDl2NCIvPjxwYXRoIGQ9Ik0xMiAxN2guMDEiLz48L3N2Zz4=)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/70f2d925-c73c-4bc2-9760-1f797fd998c0?suggestedFixEnabled=true)
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:bf9da407-6b5e-416f-ad3c-df9f6eddfd12 -->
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to