korbit-ai[bot] commented on code in PR #32857:
URL: https://github.com/apache/superset/pull/32857#discussion_r2013692516
##########
superset/db_engine_specs/drill.py:
##########
@@ -144,3 +145,14 @@ def fetch_data(
if str(ex) == "generator raised StopIteration":
return []
raise
+
+ @staticmethod
+ def _mutate_label(label: str) -> str:
+ """
+ Suffix with the first six characters from the md5 of the label to avoid
+ collisions with original column names
+
+ :param label: Expected expression label
+ :return: Conditionally mutated label
+ """
+ return f"{label}_{md5_sha_from_str(label)[:6]}"
Review Comment:
### Insufficient Hash Length for Collision Prevention <sub></sub>
<details>
<summary>Tell me more</summary>
###### What is the issue?
The _mutate_label method could potentially generate collisions if two
different labels produce MD5 hashes that share the same first 6 characters.
###### Why this matters
While rare, MD5 hash collisions in the first 6 characters could lead to the
same functionality issue the code is trying to solve - duplicate column names
causing query errors.
###### Suggested change ∙ *Feature Preview*
Increase the number of characters used from the hash to reduce collision
probability. A recommended solution would be to use at least 8 characters:
```python
return f"{label}_{md5_sha_from_str(label)[:8]}"
```
###### Provide feedback to improve future suggestions
[](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/07964f89-9946-448c-95e6-dd5de52d1d8c/upvote)
[](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/07964f89-9946-448c-95e6-dd5de52d1d8c?what_not_true=true)
[](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/07964f89-9946-448c-95e6-dd5de52d1d8c?what_out_of_scope=true)
[](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/07964f89-9946-448c-95e6-dd5de52d1d8c?what_not_in_standard=true)
[](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/07964f89-9946-448c-95e6-dd5de52d1d8c)
</details>
<sub>
💬 Looking for more details? Reply to this comment to chat with Korbit.
</sub>
<!--- korbi internal id:d50c8dfe-a8bb-40cb-ac29-5c9c81129991 -->
[](d50c8dfe-a8bb-40cb-ac29-5c9c81129991)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]